Vulcan
|
2a52e9d292
|
Llama 3.1 Support
README.md - Added examples and docs for Llama 3.1 usage
run / runq - Llama 3.1 works because Llama 3 is already supported
|
2024-07-24 01:56:56 +05:30 |
|
Vulcan
|
3d9ae22541
|
Update run and runq
run - mirror changes to runq
|
2024-07-20 21:35:09 +05:30 |
|
Vulcan
|
e842bf7118
|
Update runq.c
runq - more OpenMP / OpenACC parallel loops
|
2024-07-20 20:53:25 +05:30 |
|
Vulcan
|
1c47da5ebf
|
Update runq.c
runq - speed up rmsnorm with OpenMP / OpenACC
|
2024-07-20 19:47:46 +05:30 |
|
Vulcan
|
16e223fbca
|
Update runq.c
runq - Undo #pragma omp parallel sections for matmuls for now, as there is no real benefit with a low number of cores
|
2024-07-20 19:20:30 +05:30 |
|
Vulcan
|
725faaa608
|
Update runq.c
|
2024-07-20 19:14:56 +05:30 |
|
Vulcan
|
fae1157b0b
|
runq - Add OpenMP parallel regions
runq - Experiment to verify whether OpenMP parallel sections speed up matmuls
Ref: https://github.com/karpathy/llama2.c/pull/75
|
2024-07-20 19:08:18 +05:30 |
|
Vulcan
|
036d7cb9f2
|
runq - remove BLAS & optimize
runq - optimize matmul and quantization functions with OpenMP
|
2024-07-20 17:44:29 +05:30 |
|
Vulcan
|
8458b68338
|
runq and runc tiny fixes
runq - add BLAS for matmul
|
2024-07-19 14:57:19 +05:30 |
|
Vulcan
|
e893f18a36
|
Support Llama 3 8-bit quantized inference
runq - add Llama 3 support
|
2024-07-12 11:52:03 +05:30 |
|
Vulcan
|
4d6452ed5b
|
Makefile: LLVM BOLT Support
- Makefile: Add LLVM BOLT build
Usage:
make BOLTPREP=1 <target> ; make run_bolt
- run.c / runq.c : Enable exit command in prompt in embedded model builds
- README.md: Update usage
|
2024-04-05 21:37:48 +05:30 |
|
Vulcan
|
d62525d980
|
runq.c - Disabled cblas matmul
May need an invasive rewrite for 8-bit quantization. Won't fix.
|
2024-03-20 17:32:16 +05:30 |
|
Vulcan
|
dd82c76dce
|
L2Efy runq.c
TODO:
- BLAS builds are broken
- Add to Makefile
|
2024-03-20 16:43:04 +05:30 |
|
Andrej
|
e0eb8b29ab
|
Merge pull request #444 from maxbbraun/patch-1
Fix typo in runq.c comment
|
2024-02-12 17:21:08 -08:00 |
|
digger yu
|
2fbf7059aa
|
fix some typos
|
2023-11-28 18:09:22 +08:00 |
|
Max Braun
|
c760ae6171
|
Fix typo in runq.c comment
|
2023-11-11 19:00:00 -08:00 |
|
Andrej Karpathy
|
b233b77058
|
add some docs for runq
|
2023-10-09 16:35:51 +00:00 |
|
atamyrat
|
6e52df9b41
|
properly handle token embeddings & shared classifier wcls
|
2023-08-27 08:18:03 +03:00 |
|
atamyrat
|
06175b946b
|
free() QuantizedTensors
|
2023-08-27 06:47:03 +03:00 |
|
atamyrat
|
f850a97c6a
|
draft refactor to use QuantizedTensor in function arguments
|
2023-08-27 06:05:20 +03:00 |
|
Andrej Karpathy
|
df80471914
|
draft of int8 attempt number two
|
2023-08-26 22:28:08 +00:00 |
|
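
Several entries above mention parallelizing matmul loops with OpenMP. As a rough illustration only, the idea can be sketched on a plain float matrix-vector product. This is not the code from `runq.c`, which operates on 8-bit quantized tensors. The function name `matmul_omp` and its signature are hypothetical, chosen for this example.

```c
/* Hypothetical sketch: W (d x n, row-major) times x (n) -> xout (d).
 * Each output row is independent, so the outer loop can be split
 * across threads. Build with -fopenmp to enable the pragma; without
 * that flag the pragma is ignored and the loop runs serially. */
void matmul_omp(float *xout, const float *x, const float *w, int n, int d) {
    #pragma omp parallel for
    for (int i = 0; i < d; i++) {
        float val = 0.0f;              /* private accumulator per row */
        for (int j = 0; j < n; j++) {
            val += w[i * n + j] * x[j];
        }
        xout[i] = val;
    }
}
```

Because every thread writes to a distinct `xout[i]`, no reduction clause or locking is needed. Whether this gives a speedup depends on the matrix size and the number of cores, which is consistent with the entry above that reverted parallel sections on low-core machines.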