Decoding the Success of Gzip + KNN: The Central Role of LZ77
This is all quite true, but tokenization is a very small part of AI, and not the cutting-edge path to broad-knowledge AI assistants.
Context is also very important, and that is where the vector database comes in: tokens are embedded as vectors and associated with one another so that sentences can carry meaning.
The choice between tokenization and vectorization largely depends on the problem.
If you're just translating a lot of words, then tokenization, and perhaps compressed tokenization, is most useful. Especially for running AI in mobile phone apps, as predicted for the near future, any means of compressing the input stream and saving memory is valuable.
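For what it's worth, the compressed-stream idea the article discusses fits in a few lines. Here is a minimal sketch of the normalized compression distance used by the gzip + kNN approach (the example strings are made up, and a real classifier would rank many training texts by this distance):

```python
import gzip

def clen(s: str) -> int:
    # length of the gzip-compressed UTF-8 bytes of s
    return len(gzip.compress(s.encode("utf-8")))

def ncd(a: str, b: str) -> float:
    # normalized compression distance: small when the pair shares
    # structure that LZ77 can reuse across the concatenation
    ca, cb, cab = clen(a), clen(b), clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

# identical texts compress together almost for free; unrelated ones don't
print(ncd("the cat sat on the mat", "the cat sat on the mat"))
print(ncd("the cat sat on the mat", "stock prices fell sharply today"))
```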
On the other hand, if you're comparing documents, or asking the AI about a deep and long subject that requires memory, the vector-database methods are far better, because the meaning of the words is not lost, unlike merely enumerating tokens, whether as words or syllables.
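As a toy illustration of "meaning is not lost": cosine similarity over even two dimensions keeps relatedness that a flat token ID cannot. The vectors below are made up for the example, not from any real embedding model:

```python
import math

def cosine(u, v):
    # cosine similarity: ~1.0 means same direction, ~0.0 orthogonal
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# made-up axes: (how "human" the concept is, how "inanimate" it is)
man   = (0.90, 0.10)
woman = (0.85, 0.15)
rock  = (0.10, 0.90)

print(cosine(man, woman))  # high: both sit on the "human" axis
print(cosine(man, rock))   # low: little shared meaning
```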
Sorry for responding, but this 'all you need' framing has been common for something like five years now, and nothing is all you need. Like the Swiss Army knife, use the right tool for the job: when your only tool is a hammer, everything looks like a nail, and compressed tokenization is not all you need.
I don't know if you're the GitHub author, but a useful stat for this article would be the total memory used in each case, so people can clearly see which approach best compresses the stream. Also, since you're using FP32/64, you might want to consider FP8/16, depending on the depth of the tokens, to achieve maximum compression.
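On the FP8/16 point, the memory saving is easy to measure with NumPy. The embedding-table shape below is made up for illustration, and note that stock NumPy stops at FP16 (FP8 needs extra packages such as ml_dtypes):

```python
import numpy as np

# a made-up embedding table: 10,000 tokens x 128 dimensions
emb32 = np.ones((10_000, 128), dtype=np.float32)
emb16 = emb32.astype(np.float16)   # halves the footprint

print(emb32.nbytes)  # 5120000 bytes
print(emb16.nbytes)  # 2560000 bytes
```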
I personally prefer two-dimensional vector representations, because I want meaning: two dimensions can show you that man/woman are similar because both are human, whereas a one-dimensional tokenization may tell you that man/woman are close but miss the common family. For the AI beginner, I would suggest just writing your own bag-of-words in Python, understanding tokenization, and being done with it, as the real meat of AI is in training the weights and doing the minimization during training, aka linear algebra.
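If it helps any beginner reading this, a from-scratch bag-of-words really is only a few lines of Python (the function name and example documents here are my own, not from the article):

```python
from collections import Counter

def bag_of_words(docs):
    # vocabulary = sorted set of all lowercase whitespace tokens
    vocab = sorted({w for d in docs for w in d.lower().split()})
    # one count vector per document, aligned to the vocabulary
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors

vocab, vecs = bag_of_words(["the cat sat", "the dog sat"])
print(vocab)  # ['cat', 'dog', 'sat', 'the']
print(vecs)   # [[1, 0, 1, 1], [0, 1, 1, 1]]
```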
I think this topic is good for advanced optimization, say for a person who is trying to push a HUGE app onto a mobile phone and needs to compress memory.