base_model: zai-org/GLM-4.6
tags:
- mlx
---

**See GLM-4.6 6.5bit MLX in action - [demonstration video - coming soon](https://www.youtube.com/xcreate)**

*q6.5bit quant typically achieves the lowest perplexity in our testing*
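Perplexity here is the standard language-model quality metric (lower is better): the exponential of the average negative log-likelihood per token. A minimal sketch of how it is computed, for reference:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Example: three tokens each predicted with probability 0.5
print(perplexity([math.log(0.5)] * 3))  # → 2.0
```

A model that assigned probability 1.0 to every token would score a perplexity of exactly 1, the theoretical floor.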
## Usage Notes

* Runs on a single M3 Ultra with 512 GB RAM using the [Inferencer app](https://inferencer.com)
* Memory usage: ~270 GB
* Expect ~16 tokens/s
* Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.27
* For more details see the [demonstration video - coming soon](https://www.youtube.com/xcreate) or visit [GLM-4.6](https://huggingface.co/zai-org/GLM-4.6).
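As a rough sanity check on the memory figure, a quantized model's weight footprint scales as parameters × bits per weight. The ~355B parameter count below is an assumption, not stated in this card, and real footprints differ because per-layer bit widths vary and quantization metadata (scales, biases) adds overhead:

```python
def quant_size_gb(n_params_billions, bits_per_weight):
    """Back-of-envelope quantized model size:
    parameters × bits-per-weight, converted from bits to gigabytes."""
    return n_params_billions * 1e9 * bits_per_weight / 8 / 1e9

# Assuming ~355B total parameters (assumption, not from this card),
# a flat 6.5-bit quant lands in the same ballpark as ~270 GB:
print(round(quant_size_gb(355, 6.5)))  # → 288
```

The gap between this estimate and the reported ~270 GB is expected: "6.5bit" quants typically mix bit widths across layers rather than applying a uniform rate.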