Updated README to replace export_meta_llama_bin.py script with export.py

This commit is contained in:
atamyrat 2023-08-21 14:24:01 +03:00
parent 36a78af5e1
commit 61c26d5392


@@ -65,10 +65,10 @@ Quick note on sampling, the recommendation for ~best results is to sample with `
## Meta's Llama 2 models
As the neural net architecture is identical, we can also inference the Llama 2 models released by Meta. Sadly there is a bit of friction here due to licensing (I can't directly upload the checkpoints, I think). So Step 1, get the Llama 2 checkpoints by following the [Meta instructions](https://github.com/facebookresearch/llama). Once we have those checkpoints, we have to convert them into the llama2.c format.
For this we need to install the python dependencies (`pip install -r requirements.txt`) and then use the `export_meta_llama_bin.py` file, e.g. for 7B model:
For this we need to install the python dependencies (`pip install -r requirements.txt`) and then use the `export.py` file, e.g. for 7B model:
```bash
python export_meta_llama_bin.py path/to/llama/model/7B llama2_7b.bin
python export.py llama2_7b.bin --meta-llama path/to/llama/model/7B
```
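As a quick sanity check on an exported file, you can read back its header. This is a minimal sketch, assuming the legacy `.bin` layout mirrors `run.c`'s `Config` struct (seven little-endian int32 fields followed by the float32 weights) — the `read_config` helper and the toy field values below are illustrative, not part of the repo:

```python
import os
import struct
import tempfile

# Assumed legacy llama2.c header: seven little-endian int32 config fields,
# matching run.c's Config struct, followed by the raw float32 weights.
FIELDS = ["dim", "hidden_dim", "n_layers", "n_heads",
          "n_kv_heads", "vocab_size", "seq_len"]

def read_config(path):
    """Read the 28-byte config header from an exported .bin checkpoint."""
    with open(path, "rb") as f:
        values = struct.unpack("<7i", f.read(28))
    return dict(zip(FIELDS, values))

# Demo on a tiny fake checkpoint (hypothetical values, not a real model):
path = os.path.join(tempfile.mkdtemp(), "tiny.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<7i", 288, 768, 6, 6, 6, 32000, 256))

print(read_config(path))  # {'dim': 288, 'hidden_dim': 768, ...}
```

If the printed dimensions look wrong (e.g. a nonsensical `vocab_size`), the export likely failed or the file is in a different format version.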
The export will take ~10 minutes or so and generate a 26GB file (the weights of the 7B model in float32) called `llama2_7b.bin` in the current directory. Despite [reported](https://github.com/karpathy/llama2.c/pull/85) efforts, I would not attempt to run anything above 7B right now for two reasons: first, 13B+ currently doesn't work because of integer overflow in pointer arithmetic, which is yet to be fixed; and second, even if it were fixed, this repo does float32 inference right now, so it would be fairly unusably slow. Once the export is done, we can run it:
@@ -316,7 +316,6 @@ If your candidate PRs have elements of these it doesn't mean they won't get merg
## unsorted todos
- delete the export_meta_llama_bin.py and export_meta_llama_hf_bin.py files. instead, import both of these into a proper model.py Transformer instance, and then export using the export script as usual.
- migrate the code to work with the new versions export and deprecate the original .bin files
- support Llama 2 7B Chat models and tune run.c to Chat UI/UX
- make it easier to add a new dataset with not too much pain