Add selectable tokenizer support for Ooba (#281)

# PR Checklist
- [ ] Did you check if it works normally in all models? *Ignore this if it doesn't use models.*
- [ ] Did you check if it works normally in all of the web, local, and node-hosted versions? If it doesn't, did you block it in those versions?
- [ ] Did you add a type def?

# Description
I made a small code change that lets the user choose the tokenizer. As I wrote in https://github.com/kwaroran/RisuAI/issues/280, differences between tokenizers cause errors when using Mistral-based models.

![image](https://github.com/kwaroran/RisuAI/assets/62899533/3eb07735-874f-46d0-bc0c-c92a32ef927b)

Since I'm not good at JavaScript, I implemented this simply: the user writes the name of the tokenizer model, and the tokenizer.ts file selects one based on that name. I tested it on my node-hosted RisuAI by sending a long context to my own server.

![image](https://github.com/kwaroran/RisuAI/assets/62899533/5b1f22a0-5b1b-4472-a994-bfe5472ba159)

As a result, ooba returned 15858 prompt tokens.

![image](https://github.com/kwaroran/RisuAI/assets/62899533/6d4c2185-07c9-4de1-8460-0983b6e45141)

When I tested with the official tokenizer implementations, the llama tokenizer and the mistral tokenizer showed a 1k-token difference. So I think adding this option will help users use oobabooga with fewer errors.
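
The name-based selection described above could look roughly like the sketch below. This is only an illustration, not the actual code in tokenizer.ts: `TokenizerKind`, `KNOWN_TOKENIZERS`, and `resolveTokenizer` are hypothetical names, and falling back to the llama tokenizer when the field is empty or unrecognized is an assumption.

```typescript
// Hypothetical sketch: none of these names come from RisuAI's tokenizer.ts.
type TokenizerKind = "llama" | "mistral";

// Tokenizer names the setting is allowed to select (assumed list).
const KNOWN_TOKENIZERS: readonly TokenizerKind[] = ["llama", "mistral"];

// Map the free-text setting value to a known tokenizer.
// Assumption: an empty or unknown value falls back to the llama tokenizer.
function resolveTokenizer(
    setting: string | undefined,
    fallback: TokenizerKind = "llama"
): TokenizerKind {
    const name = (setting ?? "").trim().toLowerCase();
    const match = KNOWN_TOKENIZERS.find((kind) => kind === name);
    return match ?? fallback;
}
```

With a helper like this, a value such as `mistral` typed into the `tokenizer` field would select the mistral tokenizer, while leaving the field blank would keep the assumed default.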
2024-01-06 19:16:58 +09:00
parent 8ba0d10d56 ff4c67b993
commit cba3ff802c
3 changed files with 15 additions and 2 deletions
--- a/src/lib/Setting/Pages/OobaSettings.svelte
+++ b/src/lib/Setting/Pages/OobaSettings.svelte
@@ -61,6 +61,8 @@
            <OptionalInput marginBottom={true} bind:value={$DataBase.reverseProxyOobaArgs.chat_instruct_command} />
        {/if}
    {/if}
+    <span class="text-textcolor">tokenizer</span>
+    <OptionalInput marginBottom={true} bind:value={$DataBase.reverseProxyOobaArgs.tokenizer} />
    <span class="text-textcolor">min_p</span>
    <OptionalInput marginBottom={true} bind:value={$DataBase.reverseProxyOobaArgs.min_p} numberMode />
    <span class="text-textcolor">top_k</span>