Upload README.md with huggingface_hub
README.md CHANGED
@@ -18,32 +18,7 @@ pipeline_tag: text-generation
 qunatized_by: twhoool02
 ---
 
-# Model Card for
-(model): LlamaForCausalLM(
-  (model): LlamaLikeModel(
-    (embedding): Embedding(32000, 4096)
-    (blocks): ModuleList(
-      (0-31): 32 x LlamaLikeBlock(
-        (norm_1): FasterTransformerRMSNorm()
-        (attn): QuantAttentionFused(
-          (qkv_proj): WQLinear_GEMM(in_features=4096, out_features=12288, bias=False, w_bit=4, group_size=128)
-          (o_proj): WQLinear_GEMM(in_features=4096, out_features=4096, bias=False, w_bit=4, group_size=128)
-          (rope): RoPE()
-        )
-        (norm_2): FasterTransformerRMSNorm()
-        (mlp): LlamaMLP(
-          (gate_proj): WQLinear_GEMM(in_features=4096, out_features=11008, bias=False, w_bit=4, group_size=128)
-          (up_proj): WQLinear_GEMM(in_features=4096, out_features=11008, bias=False, w_bit=4, group_size=128)
-          (down_proj): WQLinear_GEMM(in_features=11008, out_features=4096, bias=False, w_bit=4, group_size=128)
-          (act_fn): SiLU()
-        )
-      )
-    )
-    (norm): LlamaRMSNorm()
-  )
-  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
-)
-)
+# Model Card for Llama-2-7b-hf-AWQ
 
 <!-- Provide a quick summary of what the model is/does. -->
 
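The block removed by this hunk is a raw `print(model)` module dump that had been pasted under the card's heading. For reference, a minimal sketch that would reproduce such a dump, assuming the autoawq package (`pip install autoawq`) and the repo id `twhoool02/Llama-2-7b-hf-AWQ` (inferred from the new title and the card's `qunatized_by: twhoool02` field; the diff itself does not state the repo id):

```python
from awq import AutoAWQForCausalLM

# fuse_layers=True is what swaps in the fused modules visible in the removed
# dump (LlamaLikeModel, LlamaLikeBlock, QuantAttentionFused,
# FasterTransformerRMSNorm); without it the printout would show the stock
# LlamaForCausalLM layers instead.
model = AutoAWQForCausalLM.from_quantized(
    "twhoool02/Llama-2-7b-hf-AWQ",  # assumed repo id, see note above
    fuse_layers=True,
)
print(model)  # emits a module tree like the one removed from the card
```

The `WQLinear_GEMM(..., w_bit=4, group_size=128)` entries in the dump are AWQ's 4-bit, group-size-128 quantized replacements for the model's linear projections.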
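The commit title indicates the card was pushed programmatically rather than through the web editor. A minimal sketch of such an upload with the huggingface_hub client, under the same repo-id assumption:

```python
from huggingface_hub import HfApi

api = HfApi()  # reads the token from the local Hugging Face credentials
api.upload_file(
    path_or_fileobj="README.md",
    path_in_repo="README.md",
    repo_id="twhoool02/Llama-2-7b-hf-AWQ",  # assumed repo id
    commit_message="Upload README.md with huggingface_hub",
)
```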