Sunday, September 9, 2018

Zipf's law: causes & derivable from Pareto

I specified the relationship between Pareto and Zipf distributions in Wikipedia:


Wikipedia's excuse for Zipf's prevalence:


Preferential Attachment
A simple, common, and possibly accurate view is that there is a preferential attachment ("the rich get richer") going on, e.g. people are more strongly attracted to large cities because there are more opportunities. A common math example for preferential attachment is the Yule process.

Yule process
This is one of the best contenders for explaining Zipf's law and other Pareto (power law) distributions. In biology, the number of species in a genus (or members of a species?) seems to follow a Yule process that results in a Pareto (power law) in its long tail. This process says a new species is more likely to form in a genus in proportion to the number of species already in that genus. It's a simple "the big get bigger in proportion to their size". It assumes that no species die out. If extinction were included, the tail would probably die off more quickly, which is what realistic data shows. Overall, this process gives a hump, or at least a dip at the front end, that is commonly seen in log-log plots of real-world data. Zipf's law is rho = 0.
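
The "big get bigger" rule can be sketched as a small simulation. This is a Simon-style discrete-time variant, not necessarily Yule's original formulation, and the function name, the new-genus probability, and the seed are assumptions made up for illustration. As in the text, no species ever dies out.

```python
import random
from collections import Counter

def simulate_yule(steps, new_genus_prob=0.05, seed=0):
    """Return genus sizes (largest first) after `steps` speciation events."""
    rng = random.Random(seed)
    # species_genus[i] is the genus id of species i. Picking a uniformly
    # random species selects a genus with probability proportional to its size.
    species_genus = [0]
    n_genera = 1
    for _ in range(steps):
        if rng.random() < new_genus_prob:
            # Rare event (assumed rate): the new species founds a new genus.
            species_genus.append(n_genera)
            n_genera += 1
        else:
            # Common event: the new species joins the genus of a random parent.
            species_genus.append(rng.choice(species_genus))
    return sorted(Counter(species_genus).values(), reverse=True)

sizes = simulate_yule(20000)
print(sizes[:10])   # the few largest genera
print(len(sizes))   # number of genera
```

Plotting log(size) against log(rank) for `sizes` is one way to look for the straight middle section and the bent ends.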

Deviation of Front and Tail from straight-line log-log plots
A simple power law such as Pareto (the continuous form of Zipf's) is a straight line on a log-log plot. But most real data forms a hump, possibly more than the Yule process gives. So the front and tail ends dip down from the straight line. The preferential attachment idea is amenable to this: after cities get beyond a certain size, there are drawbacks. If a city is below a certain size, there's no benefit because of the limited ability to cooperate. If a word is used too often, it's not conveying information. If a word is used too rarely, no one understands it. So a double Pareto has been suggested to cover the front and back tails, but it gives too sharp a hump in the middle, so it seems a triple Pareto (power law) would be better. But in many cases, a primary Pareto with a secondary Pareto for the head or tail adjustment might be close enough to the best possible.
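
A rough way to see the straight line and the dips is to fit the log-log slope over different rank ranges. The "humped" data here is hypothetical: a pure power law multiplied by two damping factors I made up to bend the front and the tail. It is not a double or triple Pareto fit.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

ranks = list(range(1, 1001))
s = 1.0  # Zipf exponent
pure = [r ** -s for r in ranks]
# Hypothetical humped data: same power law with the front and tail damped.
humped = [r ** -s * (1 - math.exp(-r / 5.0)) * math.exp(-r / 2000.0) for r in ranks]

slope_pure = loglog_slope(ranks, pure)                  # -s everywhere
slope_front = loglog_slope(ranks[:10], humped[:10])     # flatter than -s
slope_tail = loglog_slope(ranks[-100:], humped[-100:])  # steeper than -s
print(slope_pure, slope_front, slope_tail)
```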

BTW, Zipf-Mandelbrot is simply a slightly more general form of Zipf's law. It throws in another constant that might be used to create a more general Pareto distribution (continuous form) by scaling it in a way that results in a CDF = 1 at infinity.
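
As a sketch of what the extra constant does, here is the usual finite-N Zipf-Mandelbrot form, p(k) proportional to 1/(k + q)^s. The function name and the value q = 2.7 are only for illustration.

```python
def zipf_mandelbrot_pmf(n, s=1.0, q=0.0):
    """Probabilities for ranks 1..n with p(k) proportional to 1 / (k + q)**s."""
    weights = [1.0 / (k + q) ** s for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

zipf = zipf_mandelbrot_pmf(100, s=1.0, q=0.0)     # q = 0 recovers plain Zipf
shifted = zipf_mandelbrot_pmf(100, s=1.0, q=2.7)  # q > 0 flattens the head

# Ratio of rank 1 to rank 2: exactly 2 for Zipf with s = 1, smaller with q > 0.
print(zipf[0] / zipf[1], shifted[0] / shifted[1])
```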

This chapter  by a physicist is excellent.

This paper says Zipf's law works because it is halfway between order and disorder. Its normalized entropy per character is 1/2, half the maximum possible. (At N=100 it's 0.37 and at N=1,000 it's 0.44, using normalized entropy = H(CDF)/log(N) in a spreadsheet).


Other possible explanations for Zipf's law
Zipf's law usually refers to systems that have an exponent of s = 1. The reason for this simple 1/rank form has been considered a mystery for a long time. A lot of theories have been proposed (two possibilities are mentioned above, but I don't know if they give s = 1).

An IEEE article (with more comments here) claims many experts say Metcalfe's law should be that the "value" of the network is N*log(N) instead of N^2 and that this gives rise to Zipf's law. N^2 assumes the value of every additional connection is the same, with every node connected to every other node. Log(N) allows for a loss in efficiency of the connections as the number of nodes increases. They point out the harmonic sum 1+1/2+1/3+ ... 1/N =~ ln(N) + 0.577. This is not surprising since the integral of 1/x is ln(x). So the network gives ln(N)+0.577 value to each of the N nodes, and the total network value is N*[ln(N)+0.577]. But the node math can't be converted (at least directly) to words and city populations because the nodes have an equal number of connections and the same distributions. Maybe ideas (for words) or occupations (for city populations) could be treated like nodes. Maybe cities or words could be treated as different networks.
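
The harmonic-sum approximation is easy to check numerically. This is only a sketch: the function names are mine, and 0.577 is written out as the Euler-Mascheroni constant.

```python
import math

EULER_GAMMA = 0.5772156649015329  # the 0.577 in the text

def harmonic(n):
    """1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

def network_value(n):
    """Total value if each of n nodes gets ln(n) + 0.577."""
    return n * (math.log(n) + EULER_GAMMA)

for n in (10, 1000, 100000):
    # Columns: N, exact harmonic sum, ln(N) + 0.577, N*[ln(N)+0.577], N^2
    print(n, harmonic(n), math.log(n) + EULER_GAMMA, network_value(n), n ** 2)
```

The gap between the exact sum and ln(N) + 0.577 shrinks roughly like 1/(2N), so the approximation is already good for modest N.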



Tuesday, September 4, 2018

Analogy between Gravity/Momentum and Magnetism/Electrostatics

There is a bewildering array of different ways to find mathematical analogies between mechanics and electromagnetism. Feynman discusses analogues to electrostatics, stating that at least part of the analogy may simply be the result of everything needing to be calculated in terms of space; since vector math is the best way to deal with space, the equations will naturally come out similar. (The vector math of curl, divergence, and gradient should show up a lot.) For a simple example, the 1/R^2 rule shows up a lot. It is the strength of something at a distance R from a point source of some quantity. The source is emitting "rays" of some sort that spread out as you get further from the source. If the quantity we're interested in is proportional to the number of rays per surface area of an imaginary sphere of radius R, then we have a 1/R^2 rule. It shows up in gravity, electrostatics, sound intensity, and the probability P per second of getting hit by a bullet from a madman (point source) firing N bullets per second in completely random 3D directions.
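
The bullet example can be written as rays per unit area of the sphere. This sketch returns expected hits per second, which is close to the probability per second only when the number is small. The target area and the function name are assumptions for illustration.

```python
import math

def hits_per_second(n_per_second, radius, target_area):
    """Expected hits per second on a small target facing the source.

    Assumes directions are uniform over the sphere and the target is small
    compared with the sphere of that radius.
    """
    sphere_area = 4.0 * math.pi * radius ** 2
    return n_per_second * target_area / sphere_area

near = hits_per_second(100, radius=10.0, target_area=0.5)
far = hits_per_second(100, radius=20.0, target_area=0.5)
print(near, far, near / far)  # doubling R divides the rate by 4
```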

Wikipedia has a bunch of analogies between mechanical and electrical systems, but the most natural one seems to be the first one mentioned, the impedance analogy. It's presented as 4 different substitutions, but it looked like there should be a simple source of the relationships. I noticed that simply replacing charge movement (current) with "meters movement" (velocity) would derive the relationships. Replacing charge with meters is a bizarre idea, which could explain why respectable sources do not seem to mention this connection.

Two basic electricity equations are:

V = L di/dt
i = 1/J dV/dt

V = volts
L = inductance
i = q/second
q = charge
1/J = C = capacitance

The charge => meters substitution gives valid analogous relationships:

F = M dv/dt
v = 1/K dF/dt

F = force
M = mass
v = x/second
x = meters
K = Hooke's law spring constant
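
One way to see that the two pairs of equations are the same math is to integrate them with one function and read the variables either way. This is only a sketch under my own assumptions: the names "effort", "flow", "inertia", and "stiffness" are made up, the numbers are arbitrary, and the minus sign comes from connecting the two elements together (the spring pushes back on the mass, the capacitor on the inductor), which the component equations alone do not state.

```python
def simulate(inertia, stiffness, effort0, dt=1e-3, steps=5000):
    """Integrate a connected inertia/stiffness pair with semi-implicit Euler.

    Mechanical reading: inertia=M, stiffness=K, effort=F, flow=v.
    Electrical reading: inertia=L, stiffness=J=1/C, effort=V, flow=i.
    """
    effort, flow = effort0, 0.0
    energies = []
    for _ in range(steps):
        # effort = inertia * d(flow)/dt, with the push-back sign (assumption).
        flow -= effort / inertia * dt
        # flow = (1/stiffness) * d(effort)/dt
        effort += stiffness * flow * dt
        # Inertial energy plus stored energy, e.g. 1/2 M v^2 + F^2 / (2 K).
        energies.append(0.5 * inertia * flow ** 2 + 0.5 * effort ** 2 / stiffness)
    return effort, flow, energies

# Read as M = 2, K = 8, F0 = 1, or equally as L = 2, J = 8, V0 = 1.
effort, flow, energies = simulate(inertia=2.0, stiffness=8.0, effort0=1.0)
print(effort, flow, min(energies), max(energies))
```

The total energy stays near its starting value of effort0^2 / (2 * stiffness) while sloshing between the two forms, whichever way the symbols are read.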

These equations extend to valid energy equations:
E = 1/2 J q^2  (capacitive energy)
E = 1/2 K x^2  (spring energy)
E = 1/2 L i^2  (inductive energy)
E = 1/2 M v^2 (kinetic energy)

Separating a capacitor's plates by distance x does not result in the spring equation. The analogy works because as a charge q is moved across a capacitor's dielectric, it is not only resisting a voltage V, but its presence once it's there increases the voltage that future charges must move against.

The most interesting possibility is a deep parallel between L and M, because L is just the result of how charge flows. It creates a self-interacting magnetic field, and magnetism (in a non-quantum world) can be derived from q and relativity, i.e. L is not a thing in and of itself in the way we normally think of mass being something "real". L is just the result of q being forced to flow in a self-interacting way. We know the kinetic energy is also the result of relativity increasing the mass as velocity increases. Is mass just as fictitious as L? It's interesting that inductors are in the same shape as a spring. Does mass in some sense have a capacitor shape? We know in relativity the distance shortens in the direction of travel. This leads to the idea that areas of like charge are like capacitor plates being pushed together and that this is how extra energy is stored when you accelerate a mass. Inertia would just be the force needed to push like charges closer together, although the plate view is not supposed to be the correct view because, as I mentioned, capacitor energy is not the result of plates pushed together.

The energy equations above are integrals over charge, distance, current, or velocity. In other words, these are the derivatives of the energy equations:

V = J*q
F = K*x
magnetic flux = L*i
momentum = M*v
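
A quick numerical check that integrating these linear relations gives back the 1/2 ... ^2 energy equations. The values of J, q, M, and v are arbitrary illustrations, and the midpoint-rule helper is my own.

```python
def integrate(f, upper, n=100000):
    """Midpoint-rule integral of f from 0 to upper."""
    h = upper / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h

J, q = 4.0, 3.0  # illustrative values only
M, v = 2.0, 5.0

cap_energy = integrate(lambda u: J * u, q)  # integral of V dq with V = J*q
kin_energy = integrate(lambda u: M * u, v)  # integral of momentum dv with momentum = M*v

print(cap_energy, 0.5 * J * q ** 2)
print(kin_energy, 0.5 * M * v ** 2)
```

The same check works for F = K*x and magnetic flux = L*i by swapping in K, x or L, i.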

J = 1/C is the difficulty with which q can move onto the capacitor, and K is the difficulty with which "distance can move onto a spring". Similarly, mass and inductance are the difficulty of increasing v and i.

i is to magnetic flux what v is to momentum. We think of pushing charges through inductors, so maybe we should think of moving meters through mass instead of moving a mass through meters.   

Two more equations to point out:

E = F*x
E = V*q

Electrical: Moving charges are confined in a spatial arrangement that we call an inductor. If we try to accelerate them, we encounter a push-back. If we overcome the push-back so that they move faster, it will cause the inductor to have a higher internal energy. (The inductor can be thought of as a superconducting toroidal type.)

Mechanical: "Moving meters" are confined (via charges?) in an arrangement we call mass. If we try to accelerate them, we encounter a push-back from their "self-inductance". If we overcome the push-back so that they move faster, it will cause the mass to have a higher internal energy.

To go deeper, I would like to insert v in place of i in Maxwell's equations, because Einstein said Maxwell's equations are a great filter for weeding out false theoretical ideas since all relativistic ideas are subject to them. But a straight substitution into Maxwell's equations would seem to have a problem. Maxwell's equations are all about spatial relationships. Throwing in an extra spatial dimension to replace charge seems drastic. Do I exchange x and q instead of replacing the q?

Notice that in all the charge equations above, meters were not present except as implied in, say, J or L.

Will it require somehow replacing the 3D spatial system of the equations with some kind of 3D charge system? To be clear, this means there would be some sort of 3D "charge-space". I was once told spatial dimensions are the result of quantum spin, and I know spin is at the charge level, so I did a quick Google search for "spin charge", which showed electrons consist of 3 quasiparticles. Are the 3 quasiparticles related to charge in a way that is analogous to how the 3 spatial dimensions are related to mass?

Maxwell's equations can be derived from the idea of point charges emitting rays of force in 3D space, and then using relativity to generate the magnetic pair of equations. Magnetism = a relativistic effect of charges (Feynman and Schwartz cover this, but I didn't learn it in school; I came across it in a 1930's Encyclopedia Britannica before looking for it elsewhere). Magnetism is perpendicular to the movement of charge in space.