# Gradient Masks Repository

This repository contains gradient masks generated during training of the model.
## Overview
Gradient masks are boolean tensors that indicate which parameters have significant gradients during training. These masks can be used to identify important parameters for fine-tuning or to create sparse models.
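As a concrete illustration of the "sparse model" use case, a boolean mask can be applied directly to a weight tensor to zero out the non-significant entries. This is a minimal sketch with dummy tensors standing in for real model weights and a real mask:

```python
import torch

# Dummy weight tensor and dummy gradient mask, for illustration only.
weight = torch.arange(6.0).reshape(2, 3)
mask = torch.tensor([[True, False, True],
                     [False, True, False]])

# Multiplying by the boolean mask zeroes the entries the mask excludes.
sparse_weight = weight * mask
print(sparse_weight)
```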
## Usage

```python
from huggingface_hub import hf_hub_download
import torch

# Download masks for a specific step
mask_path = hf_hub_download(
    repo_id="israel-adewuyi/Qwen2.5-0.5B-Instruct-grad_masks",
    filename="masks/beta_<beta>_step_<step>_tolerance_<tolerance>.pt",
)
masks = torch.load(mask_path, map_location="cpu")

# Apply masks to a model. Note that `requires_grad` is a single boolean per
# tensor, so element-wise masking cannot be expressed through it; instead,
# register a gradient hook that multiplies each gradient by the mask,
# effectively freezing the masked-out entries.
for name, param in model.named_parameters():
    if name in masks:
        mask = masks[name].to(param.device)
        param.register_hook(lambda grad, m=mask: grad * m)
```
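Once loaded, it can be useful to check how sparse each mask is before applying it. A self-contained sketch, with a small dummy dict standing in for the downloaded `masks` file:

```python
import torch

# Dummy stand-in for masks = torch.load(mask_path):
# maps parameter name -> boolean mask tensor.
masks = {
    "layer.weight": torch.tensor([[True, False], [False, False]]),
    "layer.bias": torch.tensor([True, True]),
}

for name, mask in masks.items():
    kept = int(mask.sum())    # parameters the mask keeps
    total = mask.numel()      # total parameters in this tensor
    print(f"{name}: {kept}/{total} kept ({kept / total:.0%})")
```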
## Mask Generation

- Tolerance: [1e-05, 1e-06, 1e-07]
- Beta (EMA decay): [90, 95, 99]
- Epsilon: 1e-08
- Base Model: model
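The hyperparameters above (beta, epsilon, tolerance) suggest a mask-generation scheme based on an exponential moving average (EMA) of gradient statistics. The following is only a plausible sketch of such a scheme, not this repository's actual implementation: the use of squared gradients and the thresholding rule are assumptions, and the random tensors stand in for real training gradients.

```python
import torch

beta, eps, tolerance = 0.99, 1e-08, 1e-06  # example values from the lists above

param_shape = (4, 4)
ema = torch.zeros(param_shape)             # EMA of squared gradients

for _ in range(100):                       # stand-in for training steps
    grad = torch.randn(param_shape)        # stand-in for a real gradient
    ema = beta * ema + (1 - beta) * grad.pow(2)

# Keep parameters whose smoothed gradient magnitude exceeds the tolerance.
mask = (ema + eps).sqrt() > tolerance
print(mask.dtype, int(mask.sum()), mask.numel())
```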
## Files

- `masks/beta_<beta>_step_<step>_tolerance_<tolerance>.pt`: Boolean masks for each training step
- `metadata/beta_<beta>_step_<step>_tolerance_<tolerance>_info.json`: Metadata about each mask set
## License
This repository is licensed under the MIT License.