inferencerlabs committed
Commit 21c5d89 · verified · 1 Parent(s): 8cc2b62

Upload complete model

Files changed (1):
1. README.md (+1 −3)

README.md CHANGED
@@ -9,8 +9,6 @@ base_model: zai-org/GLM-4.6
 tags:
 - mlx
 ---
-** CURRENTLY UPLOADING **
-
 **See GLM-4.6 6.5bit MLX in action - [demonstration video - coming soon](https://www.youtube.com/xcreate)**
 
 *q6.5bit quant typically achieves the lowest perplexity in our testing*
@@ -26,7 +24,7 @@ tags:
 ## Usage Notes
 
 * Runs on a single M3 Ultra 512GB RAM using [Inferencer app](https://inferencer.com)
-* Memory usage: ~360 GB
+* Memory usage: ~270 GB
 * Expect ~16 tokens/s
 * Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.27
 * For more details see [demonstration video - coming soon](https://www.youtube.com/xcreate) or visit [GLM-4.6](https://huggingface.co/zai-org/GLM-4.6).
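The revised ~270 GB memory figure can be sanity-checked with quick arithmetic; a rough sketch, assuming GLM-4.6 has on the order of 355B total parameters (a number not stated in this README) and that the nominal 6.5 bits/weight applies uniformly:

```python
# Back-of-the-envelope memory estimate for a 6.5-bit quantization.
# Assumption (not from this README): GLM-4.6 has roughly 355B total parameters.
params = 355e9          # assumed total parameter count
bits_per_weight = 6.5   # nominal quantization width
gb = params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB
print(f"~{gb:.0f} GB")  # weights alone; ignores runtime overhead
```

The estimate (~288 GB for weights alone) lands in the same ballpark as the reported ~270 GB; the gap is plausibly explained by some layers being quantized below 6.5 bits in a mixed-precision scheme.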