I love that OrangePi is making good hardware, but after my experience with the OrangePi 5 Max, I won’t be buying hardware from them again. The device is largely useless due to a lack of software support. The same thing happened with the MangoPi MQ-Pro. I’ll just stick with RPi. I may not get as much hardware for the money, but the software support is fantastic.
Disappointing on the NPU. I have found it's a point where industry-wide improvement is necessary. People talk tokens/sec, model sizes, what formats are supported... But I rarely see an objective accuracy comparison. I repeatedly see that AI models are resilient to errors and reduced precision, which is what allows 1-bit quantization and whatnot.
But at a certain point I guess it just breaks? And they need an objective "I gave these tokens, I got out those tokens". But I guess that would need an objective gold standard ground truth that's maybe hard to come by.
Just try to find benchmark settings for top_k, temp, and the other llama.cpp parameters. There's no consistent framing of any of these things. Temp should be effectively 0 so the sampling is at least deterministic.
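To make the temp point concrete, here's a rough sketch of why temperature near 0 gives deterministic output: the softmax collapses onto the highest logit, so sampling turns into a plain argmax. The logit values and function names are made up for illustration, and this is not how llama.cpp implements it internally.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_token(logits):
    """Temperature 0 in the limit: always pick the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical logits for a 4-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.5, -1.0]

for temperature in (1.0, 0.5, 0.01):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 4) for p in probs])

print("greedy pick:", greedy_token(logits))
```

At temperature 0.01 nearly all of the probability mass sits on one token, so the random draw no longer matters.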
Right. There are countless parameters and seeds and whatnot to tweak. But theoretically, if all the inputs are the same, the outputs should be within epsilon of a known-good result. I wouldn't even mandate that temperature or any other parameter be a specific value, just that it's the same across runs. That way you can make sure even the pseudorandom processes are the same, so long as nothing pulls from a hardware RNG or something like that. Which seems like a reasonable thing for them to do, so idk, maybe an "insecure rng" mode.
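Something like this is what I have in mind, as a minimal sketch. It assumes you can dump the raw logits from both a known-good run and the run under test. All the numbers, names, and the epsilon values are hypothetical, and Python's seeded `random.Random` stands in for whatever PRNG the runtime actually uses.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Draw one token index from temperature-scaled logits using the given PRNG."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def max_abs_diff(reference, candidate):
    """Largest element-wise deviation between two logit vectors."""
    return max(abs(r - c) for r, c in zip(reference, candidate))

def within_epsilon(reference, candidate, epsilon):
    """True if the candidate has the same shape and stays within epsilon everywhere."""
    return (len(reference) == len(candidate)
            and max_abs_diff(reference, candidate) <= epsilon)

# Hypothetical values: a known-good full-precision run vs. a quantized/NPU run.
reference_logits = [2.0, 1.0, 0.5, -1.0]
candidate_logits = [1.98, 1.03, 0.47, -0.96]

SEED = 1234          # any value works, as long as both runs use the same one
TEMPERATURE = 0.8    # likewise: not mandated, just held constant

def run(logits, seed, steps=20):
    rng = random.Random(seed)  # software PRNG only, no hardware entropy
    return [sample_token(logits, TEMPERATURE, rng) for _ in range(steps)]

run_a = run(reference_logits, SEED)
run_b = run(reference_logits, SEED)

print("same seed, same inputs, same tokens:", run_a == run_b)
print("max deviation:", max_abs_diff(reference_logits, candidate_logits))
print("within 0.05:", within_epsilon(reference_logits, candidate_logits, 0.05))
print("within 0.01:", within_epsilon(reference_logits, candidate_logits, 0.01))
```

Comparing logits rather than only the sampled tokens tells you how far off the run was, not just whether it diverged. A single flipped token early on changes everything after it, so token equality alone is a pretty blunt check.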