Fractal Summary: Llama 3 Implementation (from [FractalReader Operation Diary](https://scrapbox.io/villagepump/フラクタル要約: Llama 3 Implementation))
- LLM Pricing
- It was super easy to implement using together.ai (blu3mo)(blu3mo)
- Since together.ai is compatible with the OpenAI API, Llama 3 70B runs just by switching the endpoint to together.ai and changing the model name
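The endpoint swap described above could look roughly like this, a minimal sketch with only the standard library; the base URL and model id are assumptions taken from together.ai's OpenAI-compatible interface and may differ:

```python
import json
import os
from urllib import request

# Assumption: together.ai exposes an OpenAI-style /chat/completions
# endpoint; the model id below is illustrative, not verified here.
TOGETHER_BASE = "https://api.together.xyz/v1"
MODEL = "meta-llama/Llama-3-70b-chat-hf"


def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }


def summarize(prompt: str) -> str:
    """POST the payload to the together.ai endpoint and return the reply."""
    payload = build_chat_request(prompt)
    req = request.Request(
        TOGETHER_BASE + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The point is that nothing else in the calling code changes: the payload shape is the same one the OpenAI API expects.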
- English version: https://fractal-reader.com/view/d6db0839-81d1-4bd9-bfbb-1250fe163c88
- Japanese version: https://fractal-reader.com/view/475c8c5a-3096-4830-80cb-25edc33a2ade
- Hmm, maybe it’s not as bad as expected (blu3mo)(blu3mo)
- Probably failing at Level 1 due to exceeding the token limit
- Need to think about how to handle this
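One obvious way to handle the token-limit failures above is to chunk the input before sending it. A crude sketch, assuming a rough characters-per-token ratio instead of a real tokenizer (the ratio is a guess, not a measured value):

```python
def chunk_text(text: str, max_tokens: int = 7000, chars_per_token: int = 3) -> list[str]:
    """Split text into pieces that roughly fit a token budget,
    using a crude chars-per-token estimate (no tokenizer dependency)."""
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Each chunk would then be summarized separately and the partial summaries merged, which fits the fractal-summary structure anyway.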
- The price is about 1/10th
- Well, upon closer inspection, some summary requests are failing
- There are cases where correct JSON is not being generated
- Perhaps gpt-4 is more stable in that regard
- Could add a mechanism where, if Llama fails, the request falls back to gpt-4-turbo
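The fallback idea above could be sketched like this: validate the cheap model's JSON output and only pay for the strong model when validation fails. The two model callables are hypothetical wrappers, not real API names:

```python
import json


def try_parse_json(raw: str):
    """Return parsed JSON, or None if the model's output isn't valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None


def summarize_with_fallback(prompt, cheap_model, strong_model):
    """Try the cheap model first; fall back to the strong one if the
    output fails JSON validation. `cheap_model` / `strong_model` are
    hypothetical callables wrapping e.g. Llama 3 and gpt-4-turbo."""
    result = try_parse_json(cheap_model(prompt))
    if result is not None:
        return result
    return try_parse_json(strong_model(prompt))
```

Since most requests succeed on the cheap model, the blended cost should stay close to the ~1/10 figure mentioned above.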
- Trying Qwen 1.5 as well
- This one accepts up to 30k input tokens, so longer text can be fed in
- https://fractal-reader.com/view/23da3ea3-bce7-4909-bf68-363243f60437
- Some errors are present, but these might be fixed by adjusting the prompt
- Asked for Japanese but got Chinese instead…
- Qwen's output quality is high, but it doesn't always follow the prompt instructions
- Also, the output is slow
- There's a reason cheap models are cheap: they're hard to control and produce a lot of noise
- Once again realizing the strength of gpt-4-turbo
- Nevertheless, the quality of open models is impressive
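One prompt-level mitigation for the Chinese-instead-of-Japanese problem above is to pin the output language in a system message as well as the user message. A sketch of the message-building step only; the wording is illustrative and weaker models may still ignore it:

```python
def build_messages(text: str) -> list[dict]:
    """Pin the output language with an explicit system instruction;
    weaker models often need it repeated in the user message too."""
    return [
        {"role": "system",
         "content": "You are a summarizer. Always answer in Japanese, "
                    "never in Chinese or English."},
        {"role": "user",
         "content": f"Summarize the following in Japanese:\n\n{text}"},
    ]
```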
- Want to Compare Low-Cost Open Models with GPT