llama2.c

mirror of https://github.com/trholding/llama2.c.git synced 2026-02-06 11:26:53 +00:00

Author	SHA1	Message	Date
Vulcan	2a52e9d292	Llama 3.1 Support: README.md - Added examples and docs for Llama 3.1 usage; run / runq - Llama 3.1 is supported as Llama 3 is supported	2024-07-24 01:56:56 +05:30
Vulcan	3d9ae22541	Update run and runq: run - mirror changes to runq	2024-07-20 21:35:09 +05:30
Vulcan	e842bf7118	Update runq.c: runq - moarrr openmp/openacc parallel loops	2024-07-20 20:53:25 +05:30
Vulcan	1c47da5ebf	Update runq.c: runq - speed up rmsnorm with OpenMP / OpenACC	2024-07-20 19:47:46 +05:30
Vulcan	16e223fbca	Update runq.c: runq - Undo #pragma omp parallel sections for matmuls for now as there is no real benefit with low number of cores	2024-07-20 19:20:30 +05:30
Vulcan	725faaa608	Update runq.c	2024-07-20 19:14:56 +05:30
Vulcan	fae1157b0b	runq - Add OpenMP parallel regions: runq - Experiment to verify speed up matmuls with OpenMP parallel sections; Ref: https://github.com/karpathy/llama2.c/pull/75	2024-07-20 19:08:18 +05:30
Vulcan	036d7cb9f2	runq - remove blas & optimize: runq - optimize matmul and quantization functions with OpenMP	2024-07-20 17:44:29 +05:30
Vulcan	8458b68338	runq and runc tiny fixes: runq - add blas for matmul	2024-07-19 14:57:19 +05:30
Vulcan	e893f18a36	Support Llama3 8bit quantized inference: runq - add llama3 support	2024-07-12 11:52:03 +05:30
Vulcan	4d6452ed5b	Makefile: LLVM BOLT Support; Makefile: Add LLVM BOLT build (usage: make BOLTPREP=1 <target> ; make run_bolt); run.c / runq.c: Enable exit command in prompt in embedded model builds; README.md: Update usage	2024-04-05 21:37:48 +05:30
Vulcan	d62525d980	runq.c - Disabled cblas matmul: May need invasive rewrite for 8bit quant. Won't fix.	2024-03-20 17:32:16 +05:30
Vulcan	dd82c76dce	L2Efy runq.c: TODO - BLAS builds are broken; Add to Makefile	2024-03-20 16:43:04 +05:30
Andrej	e0eb8b29ab	Merge pull request #444 from maxbbraun/patch-1: Fix typo in runq.c comment	2024-02-12 17:21:08 -08:00
digger yu	2fbf7059aa	fix some typo	2023-11-28 18:09:22 +08:00
Max Braun	c760ae6171	Fix typo in runq.c comment	2023-11-11 19:00:00 -08:00
Andrej Karpathy	b233b77058	add some docs for runq	2023-10-09 16:35:51 +00:00
atamyrat	6e52df9b41	properly handle token embeddings & shared classifier wcls	2023-08-27 08:18:03 +03:00
atamyrat	06175b946b	free() quantizedtensors	2023-08-27 06:47:03 +03:00
atamyrat	f850a97c6a	draft refactor to use QuantizedTensor in function arguments	2023-08-27 06:05:20 +03:00
Andrej Karpathy	df80471914	draft of int8 attempt number two	2023-08-26 22:28:08 +00:00

21 Commits