Commit Graph

658 Commits

Author SHA1 Message Date
Vulcan
548bf5df95 Rebase (WIP)
Do not clone
2023-08-21 23:53:31 +05:30
Oleg Zabluda
c2834c8a1f Remove unneeded check of free(NULL)
Passing NULL to free() is totally allowed
2023-08-21 10:54:53 -07:00
Andrej
ee95b1bf29 Merge pull request #315 from davidar/vocab_source
Fix vocab_source in sample.py
2023-08-21 08:26:28 -07:00
Andrej Karpathy
d02e0c90d8 Merge branch 'rdentato-patch-check-params' 2023-08-21 15:17:37 +00:00
Andrej Karpathy
33d94f60a5 parameter validation cleanup 2023-08-21 15:17:14 +00:00
Remo Dentato
2d972f1763 Merge branch 'karpathy:master' into patch-check-params 2023-08-21 17:02:42 +02:00
Andrej
8a3ea7b433 Merge pull request #329 from atamurad/import_meta
Moved export_meta_llama_bin.py to new export.py
2023-08-21 07:34:32 -07:00
atamyrat
61c26d5392 Updated README to replace export_meta_llama_bin.py script with export.py 2023-08-21 14:24:01 +03:00
atamyrat
36a78af5e1 tested load_meta_model() in export.py, deleting old export_meta_llama_bin.py file 2023-08-21 14:19:56 +03:00
atamyrat
de005474d3 Added load_meta_model() to export.py 2023-08-21 14:13:47 +03:00
Vulcan
410e17ea67 Merge remote-tracking branch 'upstream/master' 2023-08-21 16:37:10 +05:30
Vulcan
fc2e5aebd4 Update run.c for manual rebase (WIP)
Do not clone (WIP)
2023-08-21 16:36:31 +05:30
Vulcan
2e62eacfbc Baremetal/Portable Embedding (WIP)
Embedding model, tokenizer & assets for baremetal and portable builds (WIP)
2023-08-21 15:18:33 +05:30
rdentato
4444575c4e Added check of generation parameters. 2023-08-21 06:43:39 +00:00
Andrej Karpathy
dd61b13e57 delete the save_torchscript export file, but copy its content to the new export.py for the future maybe 2023-08-21 05:09:06 +00:00
Andrej Karpathy
ea44f53568 now that the export.py HF functionality is in master, we can delete this file, and update the readme 2023-08-21 04:58:19 +00:00
Andrej
801c68f5a1 Merge pull request #326 from atamurad/import_hf
Added huggingface model loader/importer to export.py
2023-08-20 21:53:17 -07:00
Andrej
74a68eeb35 Merge pull request #325 from HarryGifford/users/hegi/update-readme-threading
Update readme with suggestion on number of threads to use
2023-08-20 21:50:26 -07:00
Andrej Karpathy
288b3cec09 remove dagger in the eyeball 2023-08-21 04:47:49 +00:00
Andrej Karpathy
14275bd623 minor clean. i think a lot of chaos has been reduced for today. we shall now rest. 2023-08-21 04:43:24 +00:00
Andrej Karpathy
3868f732a4 and finally refactor the Sampler. things are starting to look a lot cleaner I think 2023-08-21 04:23:02 +00:00
Andrej Karpathy
8a377a1d31 refactor the Transformer (Config, Weights, RunState) into a single object, with build and free too 2023-08-21 03:55:12 +00:00
Andrej Karpathy
ae2e4f8d88 name the tokenizer methods cleaner: encode and decode 2023-08-21 03:11:54 +00:00
atamyrat
0dd82158f6 removed transformers from requirements.txt, added error message 2023-08-21 06:07:29 +03:00
atamyrat
155475a523 Fix WQ and WK permutation in huggingface models 2023-08-21 05:16:11 +03:00
atamyrat
d7704bdeaa mark ModelArgs.hidden_dim as optional and calculate as previously if not provided 2023-08-21 03:40:34 +03:00
atamyrat
09db52c69e Added huggingface model loader to export.py 2023-08-21 02:59:12 +03:00
Harry Gifford
a72b3b0206 Update readme with suggestion on number of threads to use
Update the documentation to make suggestions on the number of threads. The performance difference can be very large. Also linked to the PyTorch docs which are relevant here.
2023-08-20 15:01:33 -07:00
Andrej Karpathy
c74456f3f0 refactor step 1. the tokenizer, and all the other abstractions, are a total mess, refactoring things a bit 2023-08-20 18:18:23 +00:00
Andrej Karpathy
1e335a41cf remove freq_cis fields as they are not used anymore 2023-08-20 17:26:43 +00:00
Andrej Karpathy
c0511de617 probindex should never have been part of RunState. i apologize for this failure of abstraction 2023-08-20 17:18:06 +00:00
Andrej
8c93c7a30e Merge pull request #322 from karpathy/feature/export
New model export (the code remains "dead" and legacy version is still the default behavior, so no breaking changes are introduced). The major benefit is a new export.py file, which we can use to centralize work on formatting: both imports and exports.
2023-08-20 10:08:32 -07:00
Andrej Karpathy
13dcee493a todos update 2023-08-20 17:02:22 +00:00
Andrej Karpathy
f3db92a2dc use out_file.tell() instead of nbytes += arithmetic 2023-08-20 16:51:35 +00:00
Vulcan
f7a7ed94c8 Merge remote-tracking branch 'upstream/master' 2023-08-20 17:19:07 +05:30
Vulcan
bbb7b0e365 Update README.md 2023-08-20 14:37:23 +05:30
Andrej Karpathy
fa8dfd854e isolate read_checkpoint, because i'd like to now make it support both version 0 and version 1 2023-08-19 19:21:12 +00:00
Andrej Karpathy
4df5e2e939 make version 1 be the legacy export but with new header. version 2 will be Q8_0 export 2023-08-19 18:51:32 +00:00
Andrej Karpathy
4212bd6d43 oops fix double indent on quantize def 2023-08-19 18:34:49 +00:00
Andrej Karpathy
7f551dbfd7 new model export: versions 0 (legacy) and 1 2023-08-19 18:25:20 +00:00
Andrej
6c5d78fa41 Merge pull request #317 from yiminghan/yhan/old
Add a link to Dart port in README
2023-08-19 10:01:08 -07:00
Andrej
db1a722816 Merge pull request #318 from rahoua/master
YARP - Yet Another Rust Port in README.md
2023-08-19 10:00:56 -07:00
Andrej
d2a546c577 Merge pull request #319 from RahulSChand/warning
Give better error message in Tinystories data loader
2023-08-19 10:00:27 -07:00
rahulschand
fbefeec1b1 add assert message to give better warning 2023-08-19 13:05:26 +05:30
rahoua
978c311b30 Add pecca-rs to README.md 2023-08-18 14:58:21 -07:00
YiMing Han
882e480bc0 update read me 2023-08-18 15:18:29 -04:00
YiMing Han
d09ebbb32b Revert "working one"
This reverts commit 8607b11ea1.
2023-08-18 15:14:08 -04:00
YiMing Han
bc7cb7d0e8 Revert "only dart"
This reverts commit 01df3731d6.
2023-08-18 15:13:59 -04:00
YiMing Han
01df3731d6 only dart 2023-08-18 15:09:24 -04:00
YiMing Han
8607b11ea1 working one 2023-08-18 15:07:41 -04:00