Commit Graph

658 Commits

Author SHA1 Message Date
Andrej Karpathy
91d57db925 add note on code llama being a bit wrong 2023-08-26 21:22:19 +00:00
Andrej
f856539f41
Merge pull request #363 from byte-6174/patch-1
fix tinyllamas url
2023-08-26 14:13:20 -07:00
byte-6174
b5a0b65dbf fix tinyllamas url 2023-08-26 17:05:21 -04:00
Andrej
7b0017c6cd
Merge pull request #362 from byte-6174/upmaster
freeing tokenizer in test.c
2023-08-26 14:03:31 -07:00
Andrej Karpathy
50832e3dff move script into the new docs folder 2023-08-26 21:02:23 +00:00
Andrej Karpathy
1386edfd90 add docs on stories260K 2023-08-26 20:52:49 +00:00
Aniket
32cecbfe4a freeing tokenizer in test.c 2023-08-26 16:35:50 -04:00
Vulcan
57a807130a Makefile target for termux
Added make run_incbin target for termux on Android

Usage:

In termux do:

pkg upgrade
pkg install git
pkg install make
pkg install clang
pkg install wget
git clone https://github.com/trholding/llama2.c
cd llama2.c
make run_incbin
./run

Ref: https://github.com/trholding/llama2.c/issues/7
2023-08-27 02:00:06 +05:30
Andrej
e47bacdc62
Merge pull request #355 from janimo/export-vocab-size
Export vocab size and Code Llama usage docs
2023-08-26 13:24:55 -07:00
Vulcan
551b166ba2 Makefile updates
Added changes from upstream
Added list target to get a list of targets

Usage:
make list
2023-08-27 01:27:41 +05:30
Jani Monoses
604d3c59c0 Add Code Llama info 2023-08-26 22:36:09 +03:00
Jani Monoses
2c2b284988 Get vocab_size from token embeddings size 2023-08-26 22:35:55 +03:00
Vulcan
a124fd34ea Merge remote-tracking branch 'upstream/master' 2023-08-26 23:38:57 +05:30
Vulcan
d311a72304 Makefile fix for missing target
Added the missing run_cosmocc target
2023-08-26 23:27:58 +05:30
Vulcan
766ff6aa35 Update README.md 2023-08-26 22:47:49 +05:30
Michael Shalala
7ec4f6a1df
Update README.md - Adding F# Port
An F# port. Uses vectorization and parallelism and is really fast.
2023-08-26 08:50:55 +03:00
Vulcan
d3b97f45b6 Update README.md 2023-08-26 10:06:54 +05:30
Vulcan
b508909609 Unikernel fixes
Makefile and run.c fixes.
2023-08-26 03:24:04 +05:30
Vulcan
f2fee04b35 unikernel support (WIP)
Usage:

make run_unik_qemu_x86_64

Run with:

qemu-system-x86_64 -m 256m -accel kvm -kernel build/L2E_qemu-x86_64
2023-08-26 01:38:15 +05:30
Andrej
49daf18f2f
Merge pull request #343 from karpathy/feature/chat
Add interactive loop to enable nice chat with a Llama 2 Chat model
2023-08-25 08:00:11 -07:00
Andrej
4a7a62bd21 Merge branch 'master' into feature/chat 2023-08-25 07:58:33 -07:00
Andrej
5c6427e4d7
Merge pull request #352 from dmarcos/readmeTypo
Fix typo in README.md
2023-08-25 07:55:54 -07:00
Andrej
cbc2488b82
Merge pull request #353 from photomz/master
Clearer WandB log step
2023-08-25 07:55:26 -07:00
Andrej Karpathy
fbe324fc5a adjust things a bit 2023-08-25 14:54:05 +00:00
Markus Zhang
6def77d4ba Correct WandB log step 2023-08-25 17:12:29 +08:00
Diego Marcos Segura
19cfbeca71 Fix typo in README.md 2023-08-24 19:46:43 -07:00
Andrej
d7cd98633d add todo item to add a PyTorch Engine 2023-08-24 09:04:52 -07:00
Andrej Karpathy
3d787b2463 ok getting closer, and manually verified correctness of the schema matching python. still some weirdness in the printing to chase down, and also have to tune the buffer lengths and make them sensible and such 2023-08-24 04:31:06 +00:00
Andrej Karpathy
40fb902cf0 fix chat format bug i think 2023-08-24 03:33:44 +00:00
Andrej Karpathy
c7a26264a2 Merge branch 'master' of github.com:karpathy/llama2.c 2023-08-24 03:10:18 +00:00
Andrej Karpathy
446c1c0df3 Merge branch 'janimo-train-vocab-python' 2023-08-24 03:10:07 +00:00
Andrej Karpathy
096325b66c bring back num_threads 2023-08-24 03:09:55 +00:00
Andrej
90104db721
Merge pull request #348 from nehzata/clip_steps
Clip steps maximum value
2023-08-23 19:57:01 -07:00
Ali Nehzat
9bc72acab0 steps shouldn't exceed the model's seq_len either 2023-08-24 09:09:16 +10:00
Andrej Karpathy
c5e0e7fce4 attempt at chat function, but it was 8AM and I didn't have coffee yet. Seems to work but it's probably subtly broken or too complex. version 1 only, lots of hard-coded non-sensical buffer sizes. Have to go to work now 2023-08-23 16:27:48 +00:00
Jani Monoses
fe9b9f2f15 Train vocab in Python 2023-08-23 19:10:28 +03:00
Vulcan
d588cc1289 Various small improvements
See README.md
2023-08-23 19:26:40 +05:30
Andrej Karpathy
7ac65cb2c2 make decode safer and fix issue with skipping bad byte tokens 2023-08-23 01:08:31 +00:00
Andrej Karpathy
4b3e66021a lol text 2023-08-23 00:26:47 +00:00
Andrej Karpathy
d1eb18b8ec add BOS and EOS function to the Tokenizer as we start to converge closer to the Llama 2 code from Meta, and as we're about to add the Chat capability 2023-08-23 00:08:22 +00:00
Vulcan
e0c13d9307 Added loop to prompt, INCBIN & strliteral embedding
New ways to embed models via INCBIN and strliteral

Enabled loop back to prompt on some targets.

Available via make:
runompincbin
runompstrlit

Cosmopolitan toolchain make targets:
runcincbin
runcstrlit

Loop back to prompt on the cosmopolitan make targets is WIP and does not work as expected.

Documentation: TODO
2023-08-22 11:51:38 +05:30
Andrej Karpathy
d26a499207 absorb our rng state into the Sampler. I feel that this is correct because it makes our use of entropy very explicit and localized, and the sampler is now well-contained without any global state. Code is increasingly more beautiful. 2023-08-22 03:22:56 +00:00
Andrej Karpathy
ac6cf8d6e8 tweak todo list 2023-08-22 02:48:51 +00:00
Andrej Karpathy
ad7a1ef525 clean up swiglu a little bit 2023-08-22 02:32:21 +00:00
Andrej Karpathy
0e362f735f and finally split off the generate function. alongside it will come a chat function. we are close 2023-08-22 02:22:36 +00:00
Andrej Karpathy
d73b917d3b hide temperature and topp into the sampler, it's a little bit less flexible but a little bit cleaner 2023-08-22 02:17:51 +00:00
Andrej Karpathy
379f083b85 make sorted vocab a buffer of Tokenizer 2023-08-22 01:56:51 +00:00
Andrej
5eaca535cd
Merge pull request #335 from ozabluda/ozabluda-patch-5
Remove unneeded check of free(NULL)
2023-08-21 18:16:07 -07:00
Andrej Karpathy
83287ff254 fix steps=0 is max context 2023-08-22 01:15:00 +00:00
Vulcan
06f25f6db5 Merge remote-tracking branch 'upstream/master' 2023-08-22 00:00:59 +05:30