| Andrej | 1f8af82130 | Merge branch 'master' into feature/int8_try2 | 2023-10-09 08:34:43 -07:00 |
| Nicky Pochinkov | 2dedad6cea | Added support for repeated kv weights | 2023-09-21 16:38:06 +02:00 |
| Nicky Pochinkov | d3c25b10a6 | Add checks/config for tied embedding weights | 2023-09-21 16:36:36 +02:00 |
| Nicky Pochinkov | ffea287516 | updated comment .pt -> .bin | 2023-09-16 18:46:27 +01:00 |
| Nicky Pochinkov | a61173d6b9 | Added CLI dtype code | 2023-09-16 18:32:31 +01:00 |
| Nicky Pochinkov | 19f40a2a71 | Made default hf export torch.float32 | 2023-09-16 18:32:21 +01:00 |
| Nicky Pochinkov | fc11cc387b | Changed code so that lm_head and token_embed are tied | 2023-09-16 18:10:36 +01:00 |
| Nicky Pochinkov | f38055dfb6 | add option to set dtype for export | 2023-09-16 14:07:48 +01:00 |
| Nicky Pochinkov | bf9a1162e1 | Added error handling for LlamaConfig import | 2023-09-12 19:55:28 +01:00 |
| Nicky Pochinkov | 6360a53901 | fixed whitespace | 2023-09-12 19:53:26 +01:00 |
| Nicky Pochinkov | c568f6952d | added option to export to huggingface format | 2023-09-12 19:51:31 +01:00 |
| atamyrat | 6e52df9b41 | properly handle token embeddings & shared classifier wcls | 2023-08-27 08:18:03 +03:00 |
| atamyrat | f850a97c6a | draft refactor to use QuantizedTensor in function arguments | 2023-08-27 06:05:20 +03:00 |
| Andrej Karpathy | df80471914 | draft of int8 attempt number two | 2023-08-26 22:28:08 +00:00 |
| Jani Monoses | 2c2b284988 | Get vocab_size from token embeddings size | 2023-08-26 22:35:55 +03:00 |
| atamyrat | de005474d3 | Added load_meta_model() to export.py | 2023-08-21 14:13:47 +03:00 |
| Andrej Karpathy | dd61b13e57 | delete the save_torchscript export file, but copy its content to the new export.py for the future maybe | 2023-08-21 05:09:06 +00:00 |
| atamyrat | 0dd82158f6 | removed transformers from requirements.txt, added error message | 2023-08-21 06:07:29 +03:00 |
| atamyrat | 155475a523 | Fix WQ and WK permutation in huggingface models | 2023-08-21 05:16:11 +03:00 |
| atamyrat | 09db52c69e | Added huggingface model loader to export.py | 2023-08-21 02:59:12 +03:00 |
| Andrej Karpathy | f3db92a2dc | use out_file.tell() instead of nbytes += arithmetic | 2023-08-20 16:51:35 +00:00 |
| Andrej Karpathy | 4df5e2e939 | make version 1 be the legacy export but with new header. version 2 will be Q8_0 export | 2023-08-19 18:51:32 +00:00 |
| Andrej Karpathy | 4212bd6d43 | oops fix double indent on quantize def | 2023-08-19 18:34:49 +00:00 |
| Andrej Karpathy | 7f551dbfd7 | new model export: versions 0 (legacy) and 1 | 2023-08-19 18:25:20 +00:00 |