On 6/12/14, 4:41 PM, Joel de Guzman wrote:
On 6/12/14, 2:45 PM, Thijs (M.A.) van den Berg wrote:
On Jun 12, 2014, at 2:30 AM, Joel de Guzman wrote:
I do not think a random distribution of the number of digits is a good representation of what's happening in the real world. In the real world, especially with human-generated numbers(*), shorter strings are of course more common.
A well-known real-world property is Benford's law, often used in fraud detection to check whether numbers are fake or "natural".
If you draw random numbers uniformly from the logarithmic scale then you'll get that scale invariant property. I think that leads to a random number of digits?
http://en.m.wikipedia.org/wiki/Benford's_law#Mathematical_statement
That one is for the first digit only and not for the number of digits. Is it just a conjecture that single digits, for example, occur more frequently than, say, 1,000,000 digits? If that conjecture does not hold, then we should probably be using bignums all over! It's also a known *fact* that varint encoding performs better than uniform (fixed-width) encoding when transferring data over networks!
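To make the varint point concrete, here is a minimal sketch of one common scheme: unsigned base-128 (LEB128-style) varints, as used by Protocol Buffers. The thread does not say which varint scheme is meant, so that choice is an assumption, and the function names `encode_varint` and `decode_varint` are made up for illustration. It only shows encoded size, not network performance.

```python
def encode_varint(n):
    """Encode a non-negative int as a base-128 varint (7 payload bits per byte)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def decode_varint(data):
    """Decode a single base-128 varint from the start of a bytes object."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result
        shift += 7
    raise ValueError("truncated varint")


if __name__ == "__main__":
    for value in (1, 42, 300, 2432345676):
        encoded = encode_varint(value)
        assert decode_varint(encoded) == value
        print(value, "->", len(encoded), "byte(s)")
```

Under this scheme a small value such as 1 fits in one byte while 2432345676 needs five, versus eight bytes each in a fixed 64-bit field. The saving therefore depends entirely on small values being the common case.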
I'm not sure if there's a study of the probability of the occurrence of N digits, is there? Anyway, here's one:
http://mathematicalmulticore.wordpress.com/2011/02/04/which-numbers-are-the-...
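The two quantities being argued about here can be separated with a small simulation. The sketch below draws integers uniformly on a log10 scale, as suggested above, and tallies both the leading digit and the number of digits. The function names, the 6-digit cap, the seed and the sample size are all arbitrary choices for illustration, not anything from the benchmark under discussion.

```python
import math
import random
from collections import Counter


def log_uniform_int(rng, max_digits):
    """Draw an integer in [1, 10**max_digits) uniformly on a log10 scale."""
    exponent = rng.uniform(0.0, max_digits)
    value = int(10.0 ** exponent)  # flooring keeps leading digit and digit count
    # Clamp to guard against floating-point edge cases at the interval ends.
    return min(max(value, 1), 10 ** max_digits - 1)


def digit_stats(samples):
    """Return (first-digit frequencies, digit-count frequencies) as dicts."""
    total = len(samples)
    first = Counter(int(str(n)[0]) for n in samples)
    length = Counter(len(str(n)) for n in samples)
    first_freq = {d: first[d] / total for d in range(1, 10)}
    length_freq = {k: length[k] / total for k in sorted(length)}
    return first_freq, length_freq


def benford(d):
    """Benford's law probability of leading digit d."""
    return math.log10(1.0 + 1.0 / d)


if __name__ == "__main__":
    rng = random.Random(2014)
    samples = [log_uniform_int(rng, 6) for _ in range(200_000)]
    first_freq, length_freq = digit_stats(samples)
    for d in range(1, 10):
        print(f"first digit {d}: observed {first_freq[d]:.3f}, Benford {benford(d):.3f}")
    for k, f in length_freq.items():
        print(f"{k} digit(s): observed {f:.3f}")
```

In this model the leading digits follow Benford's law, while every digit count from 1 up to the cap is about equally likely. So a log-uniform draw does give a random number of digits, but it does not by itself make short numbers more common than long ones; that would need a further assumption about how magnitudes are distributed.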
Perhaps the math guys should set me straight and I would not be surprised if the answer is 42 again! :-)
Just for fun: google any single- or double-digit number (e.g. "1") and then google a many-digit number (e.g. "2432345676"). For "1", I got 15,550,000,000 hits. For "2432345676", I got 9 hits.
Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/