Commit Graph

24 Commits

Author SHA1 Message Date
Andrej
1f8af82130 Merge branch 'master' into feature/int8_try2 2023-10-09 08:34:43 -07:00
Nicky Pochinkov
2dedad6cea Added support for repeated kv weights 2023-09-21 16:38:06 +02:00
Nicky Pochinkov
d3c25b10a6 Add checks/config for tied embedding weights 2023-09-21 16:36:36 +02:00
Nicky Pochinkov
ffea287516 updated comment .pt -> .bin 2023-09-16 18:46:27 +01:00
Nicky Pochinkov
a61173d6b9 Added CLI dtype code 2023-09-16 18:32:31 +01:00
Nicky Pochinkov
19f40a2a71 Made default hf export torch.float32 2023-09-16 18:32:21 +01:00
Nicky Pochinkov
fc11cc387b Changed code so that lm_head and token_embed are tied 2023-09-16 18:10:36 +01:00
Nicky Pochinkov
f38055dfb6 add option to set dtype for export 2023-09-16 14:07:48 +01:00
Nicky Pochinkov
bf9a1162e1 Added error handling for LlamaConfig import 2023-09-12 19:55:28 +01:00
Nicky Pochinkov
6360a53901 fixed whitespace 2023-09-12 19:53:26 +01:00
Nicky Pochinkov
c568f6952d added option to export to huggingface format 2023-09-12 19:51:31 +01:00
atamyrat
6e52df9b41 properly handle token embeddings & shared classifier wcls 2023-08-27 08:18:03 +03:00
atamyrat
f850a97c6a draft refactor to use QuantizedTensor in function arguments 2023-08-27 06:05:20 +03:00
Andrej Karpathy
df80471914 draft of int8 attempt number two 2023-08-26 22:28:08 +00:00
Jani Monoses
2c2b284988 Get vocab_size from token embeddings size 2023-08-26 22:35:55 +03:00
atamyrat
de005474d3 Added load_meta_model() to export.py 2023-08-21 14:13:47 +03:00
Andrej Karpathy
dd61b13e57 delete the save_torchscript export file, but copy its content to the new export.py for the future maybe 2023-08-21 05:09:06 +00:00
atamyrat
0dd82158f6 removed transformers from requirements.txt, added error message 2023-08-21 06:07:29 +03:00
atamyrat
155475a523 Fix WQ and WK permutation in huggingface models 2023-08-21 05:16:11 +03:00
atamyrat
09db52c69e Added huggingface model loader to export.py 2023-08-21 02:59:12 +03:00
Andrej Karpathy
f3db92a2dc use out_file.tell() instead of nbytes += arithmetic 2023-08-20 16:51:35 +00:00
Andrej Karpathy
4df5e2e939 make version 1 be the legacy export but with new header. version 2 will be Q8_0 export 2023-08-19 18:51:32 +00:00
Andrej Karpathy
4212bd6d43 oops fix double indent on quantize def 2023-08-19 18:34:49 +00:00
Andrej Karpathy
7f551dbfd7 new model export: versions 0 (legacy) and 1 2023-08-19 18:25:20 +00:00