llama2.c

Mirror of https://github.com/trholding/llama2.c.git, last synced 2026-02-06 11:26:53 +00:00

Author	SHA1	Message	Date
Andrej	1f8af82130	Merge branch 'master' into feature/int8_try2	2023-10-09 08:34:43 -07:00
Nicky Pochinkov	2dedad6cea	Added support for repeated kv weights	2023-09-21 16:38:06 +02:00
Nicky Pochinkov	d3c25b10a6	Add checks/config for tied embedding weights	2023-09-21 16:36:36 +02:00
Nicky Pochinkov	ffea287516	updated comment .pt -> .bin	2023-09-16 18:46:27 +01:00
Nicky Pochinkov	a61173d6b9	Added CLI dtype code	2023-09-16 18:32:31 +01:00
Nicky Pochinkov	19f40a2a71	Made default hf export torch.float32	2023-09-16 18:32:21 +01:00
Nicky Pochinkov	fc11cc387b	Changed code so that lm_head and token_embed are tied	2023-09-16 18:10:36 +01:00
Nicky Pochinkov	f38055dfb6	add option to set dtype for export	2023-09-16 14:07:48 +01:00
Nicky Pochinkov	bf9a1162e1	Added error handling for LlamaConfig import	2023-09-12 19:55:28 +01:00
Nicky Pochinkov	6360a53901	fixed whitespace	2023-09-12 19:53:26 +01:00
Nicky Pochinkov	c568f6952d	added option to export to huggingface format	2023-09-12 19:51:31 +01:00
atamyrat	6e52df9b41	properly handle token embeddings & shared classifier wcls	2023-08-27 08:18:03 +03:00
atamyrat	f850a97c6a	draft refactor to use QuantizedTensor in function arguments	2023-08-27 06:05:20 +03:00
Andrej Karpathy	df80471914	draft of int8 attempt number two	2023-08-26 22:28:08 +00:00
Jani Monoses	2c2b284988	Get vocab_size from token embeddings size	2023-08-26 22:35:55 +03:00
atamyrat	de005474d3	Added load_meta_model() to export.py	2023-08-21 14:13:47 +03:00
Andrej Karpathy	dd61b13e57	delete the save_torchscript export file, but copy its content to the new export.py for the future maybe	2023-08-21 05:09:06 +00:00
atamyrat	0dd82158f6	removed transformers from requirements.txt, added error message	2023-08-21 06:07:29 +03:00
atamyrat	155475a523	Fix WQ and WK permutation in huggingface models	2023-08-21 05:16:11 +03:00
atamyrat	09db52c69e	Added huggingface model loader to export.py	2023-08-21 02:59:12 +03:00
Andrej Karpathy	f3db92a2dc	use out_file.tell() instead of nbytes += arithmetic	2023-08-20 16:51:35 +00:00
Andrej Karpathy	4df5e2e939	make version 1 be the legacy export but with new header. version 2 will be Q8_0 export	2023-08-19 18:51:32 +00:00
Andrej Karpathy	4212bd6d43	oops fix double indent on quantize def	2023-08-19 18:34:49 +00:00
Andrej Karpathy	7f551dbfd7	new model export: versions 0 (legacy) and 1	2023-08-19 18:25:20 +00:00

24 Commits