Andrej Karpathy
91d57db925
add note on code llama being a bit wrong
2023-08-26 21:22:19 +00:00
Andrej
f856539f41
Merge pull request #363 from byte-6174/patch-1
...
fix tinyllamas url
2023-08-26 14:13:20 -07:00
byte-6174
b5a0b65dbf
fix tinyllamas url
2023-08-26 17:05:21 -04:00
Andrej
7b0017c6cd
Merge pull request #362 from byte-6174/upmaster
...
freeing tokenizer in test.c
2023-08-26 14:03:31 -07:00
Andrej Karpathy
50832e3dff
move script into the new docs folder
2023-08-26 21:02:23 +00:00
Andrej Karpathy
1386edfd90
add docs on stories260K
2023-08-26 20:52:49 +00:00
Aniket
32cecbfe4a
freeing tokenizer in test.c
2023-08-26 16:35:50 -04:00
Vulcan
57a807130a
Makefile target for termux
...
Added make run_incbin target for termux on Android
Usage:
In termux do:
pkg upgrade
pkg install git
pkg install make
pkg install clang
pkg install wget
git clone https://github.com/trholding/llama2.c
cd llama2.c
make run_incbin
./run
Ref: https://github.com/trholding/llama2.c/issues/7
2023-08-27 02:00:06 +05:30
Andrej
e47bacdc62
Merge pull request #355 from janimo/export-vocab-size
...
Export vocab size and Code Llama usage docs
2023-08-26 13:24:55 -07:00
Vulcan
551b166ba2
Makefile updates
...
Added changes from upstream
Added list target to get a list of targets
Usage:
make list
2023-08-27 01:27:41 +05:30
Jani Monoses
604d3c59c0
Add Code Llama info
2023-08-26 22:36:09 +03:00
Jani Monoses
2c2b284988
Get vocab_size from token embeddings size
2023-08-26 22:35:55 +03:00
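The change above derives vocab_size from the weights rather than passing it separately. A rough Python sketch of the idea (a toy stand-in: the real export script reads a torch checkpoint, and the tensor name `tok_embeddings.weight` and the toy shape are assumptions here):

```python
import numpy as np

# Toy stand-in for a loaded checkpoint's state dict; a Llama 2 checkpoint
# stores the token-embedding table as a (vocab_size, dim) matrix.
state_dict = {"tok_embeddings.weight": np.zeros((512, 64), dtype=np.float32)}

# vocab_size falls out of the embedding table's first dimension, so it
# no longer needs to be supplied as a separate export parameter.
vocab_size = state_dict["tok_embeddings.weight"].shape[0]
print(vocab_size)  # 512
```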
Vulcan
a124fd34ea
Merge remote-tracking branch 'upstream/master'
2023-08-26 23:38:57 +05:30
Vulcan
d311a72304
Makefile fix for missing target
...
Added the missing run_cosmocc target
2023-08-26 23:27:58 +05:30
Vulcan
766ff6aa35
Update README.md
2023-08-26 22:47:49 +05:30
Michael Shalala
7ec4f6a1df
Update README.md - Adding F# Port
...
An F# port. Uses vectorization and parallelism and is really fast.
2023-08-26 08:50:55 +03:00
Vulcan
d3b97f45b6
Update README.md
2023-08-26 10:06:54 +05:30
Vulcan
b508909609
Unikernel fixes
...
Makefile and run.c fixes.
2023-08-26 03:24:04 +05:30
Vulcan
f2fee04b35
unikernel support (WIP)
...
Usage:
make run_unik_qemu_x86_64
Run with:
qemu-system-x86_64 -m 256m -accel kvm -kernel build/L2E_qemu-x86_64
2023-08-26 01:38:15 +05:30
Andrej
49daf18f2f
Merge pull request #343 from karpathy/feature/chat
...
Add interactive loop to enable nice chat with a Llama 2 Chat model
2023-08-25 08:00:11 -07:00
Andrej
4a7a62bd21
Merge branch 'master' into feature/chat
2023-08-25 07:58:33 -07:00
Andrej
5c6427e4d7
Merge pull request #352 from dmarcos/readmeTypo
...
Fix typo in README.md
2023-08-25 07:55:54 -07:00
Andrej
cbc2488b82
Merge pull request #353 from photomz/master
...
Clearer WandB log step
2023-08-25 07:55:26 -07:00
Andrej Karpathy
fbe324fc5a
adjust things a bit
2023-08-25 14:54:05 +00:00
Markus Zhang
6def77d4ba
Correct WandB log step
2023-08-25 17:12:29 +08:00
Diego Marcos Segura
19cfbeca71
Fix typo in README.md
2023-08-24 19:46:43 -07:00
Andrej
d7cd98633d
add todo item to add a PyTorch Engine
2023-08-24 09:04:52 -07:00
Andrej Karpathy
3d787b2463
ok getting closer, and manually verified correctness of the schema matching python. still some weirdness in the printing to chase down, and also have to tune the buffer lengths and make them sensible and such
2023-08-24 04:31:06 +00:00
Andrej Karpathy
40fb902cf0
fix chat format bug i think
2023-08-24 03:33:44 +00:00
Andrej Karpathy
c7a26264a2
Merge branch 'master' of github.com:karpathy/llama2.c
2023-08-24 03:10:18 +00:00
Andrej Karpathy
446c1c0df3
Merge branch 'janimo-train-vocab-python'
2023-08-24 03:10:07 +00:00
Andrej Karpathy
096325b66c
bring back num_threads
2023-08-24 03:09:55 +00:00
Andrej
90104db721
Merge pull request #348 from nehzata/clip_steps
...
Clip steps maximum value
2023-08-23 19:57:01 -07:00
Ali Nehzat
9bc72acab0
steps shouldn't exceed the model's seq_len either
2023-08-24 09:09:16 +10:00
Andrej Karpathy
c5e0e7fce4
attempt at chat function, but it was 8AM and I didn't have coffee yet. Seems to work but it's probably subtly broken or too complex. version 1 only, lots of hard-coded non-sensical buffer sizes. Have to go to work now
2023-08-23 16:27:48 +00:00
Jani Monoses
fe9b9f2f15
Train vocab in Python
2023-08-23 19:10:28 +03:00
Vulcan
d588cc1289
Various small improvements
...
See README.md
2023-08-23 19:26:40 +05:30
Andrej Karpathy
7ac65cb2c2
make decode safer and fix issue with skipping bad byte tokens
2023-08-23 01:08:31 +00:00
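The "safer decode" fix above guards against emitting raw-byte tokens that are not legible text. A minimal Python sketch of the idea (the exact predicate is an assumption, not run.c's literal check):

```python
def safe_piece(piece: str) -> str:
    # Skip empty pieces and lone bytes that are neither printable nor
    # whitespace, so garbage byte tokens never reach the output stream.
    if not piece:
        return ""
    if len(piece) == 1 and not (piece.isprintable() or piece.isspace()):
        return ""
    return piece

print(safe_piece("hello"))       # hello
print(repr(safe_piece("\x07")))  # ''
```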
Andrej Karpathy
4b3e66021a
lol text
2023-08-23 00:26:47 +00:00
Andrej Karpathy
d1eb18b8ec
add BOS and EOS function to the Tokenizer as we start to converge closer to the Llama 2 code from Meta, and as we're about to add the Chat capability
2023-08-23 00:08:22 +00:00
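Meta's Llama 2 tokenizer prepends BOS (`<s>`, id 1) and optionally appends EOS (`</s>`, id 2). The flag-style API this commit converges toward can be sketched like this (real encoding does BPE over the text, which is elided here):

```python
BOS_ID, EOS_ID = 1, 2  # sentencepiece ids for <s> and </s> in Llama 2

def wrap_tokens(ids, bos: bool, eos: bool):
    # Mirror an encode(text, bos, eos, ...) style API: the caller decides
    # whether the sequence gets the begin/end-of-sequence markers.
    return ([BOS_ID] if bos else []) + list(ids) + ([EOS_ID] if eos else [])

print(wrap_tokens([5, 6, 7], bos=True, eos=False))  # [1, 5, 6, 7]
print(wrap_tokens([5, 6, 7], bos=True, eos=True))   # [1, 5, 6, 7, 2]
```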
Vulcan
e0c13d9307
Added loop to prompt, INCBIN & strliteral embedding
...
New ways to embed models via INCBIN and strliteral
Enabled loop back to prompt on some targets.
Available via make:
runompincbin
runompstrlit
Cosmopolitan toolchain make targets:
runcincbin
runcstrlit
Loop back to prompt on cosmopolitan make targets is WIP and does not work as expected.
Documentation: TODO
2023-08-22 11:51:38 +05:30
Andrej Karpathy
d26a499207
absorb our rng state into the Sampler. I feel that this is correct because it makes our use of entropy very explicit and localized, and the sampler is now well-contained without any global state. Code is increasingly beautiful.
2023-08-22 03:22:56 +00:00
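A Python sketch of that "no global rng" shape: the sampler owns temperature, top-p, and a 64-bit xorshift*-style state, so all entropy use is explicit and local (the shift and multiplier constants here are assumptions; the structure is the point):

```python
MASK64 = (1 << 64) - 1

class Sampler:
    """Self-contained sampler: parameters and rng state, no globals."""

    def __init__(self, temperature: float, topp: float, seed: int):
        self.temperature = temperature
        self.topp = topp
        self.state = seed & MASK64  # must be nonzero for xorshift

    def random_u32(self) -> int:
        # xorshift64*-style generator, masked to 64 bits at each step
        s = self.state
        s ^= s >> 12
        s = (s ^ (s << 25)) & MASK64
        s ^= s >> 27
        self.state = s
        return ((s * 0x2545F4914F6CDD1D) & MASK64) >> 32

    def random_f32(self) -> float:
        # uniform float in [0, 1)
        return (self.random_u32() >> 8) / 16777216.0
```

Same seed, same stream: two samplers constructed with identical seeds produce identical draws, which makes generation reproducible without any shared state.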
Andrej Karpathy
ac6cf8d6e8
tweak todo list
2023-08-22 02:48:51 +00:00
Andrej Karpathy
ad7a1ef525
clean up swiglu a little bit
2023-08-22 02:32:21 +00:00
Andrej Karpathy
0e362f735f
and finally split off the generate function; alongside it will come a chat function. we are close
2023-08-22 02:22:36 +00:00
Andrej Karpathy
d73b917d3b
hide temperature and topp inside the sampler; it's a little bit less flexible but a little bit cleaner
2023-08-22 02:17:51 +00:00
Andrej Karpathy
379f083b85
make sorted vocab a buffer of Tokenizer
2023-08-22 01:56:51 +00:00
Andrej
5eaca535cd
Merge pull request #335 from ozabluda/ozabluda-patch-5
...
Remove unneeded check of free(NULL)
2023-08-21 18:16:07 -07:00
Andrej Karpathy
83287ff254
fix steps=0 is max context
2023-08-22 01:15:00 +00:00
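Together with the earlier "steps shouldn't exceed the model's seq_len" commit, the behavior amounts to a clamp like this (a sketch; the names are illustrative):

```python
def clamp_steps(steps: int, seq_len: int) -> int:
    # steps == 0 (or anything out of range) means "run to the full context"
    if steps <= 0 or steps > seq_len:
        return seq_len
    return steps

print(clamp_steps(0, 256))     # 256
print(clamp_steps(1000, 256))  # 256
print(clamp_steps(100, 256))   # 100
```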
Vulcan
06f25f6db5
Merge remote-tracking branch 'upstream/master'
2023-08-22 00:00:59 +05:30