Friday, May 6, 2016

Relation of Physical to Informational Entropy, no heat death of Universe


Shannon's entropy has units of "bits/symbol" or "entropy/symbol"; do a search on "per symbol" in Shannon's booklet. Its parallel in physical entropy is specific entropy, not total entropy. So to get the "entropy" of a message you multiply by the number of symbols in the message: S = N*H. In physical entropy you multiply specific entropy by the number of moles or kg to get total entropy.

Physical entropy with Boltzmann's constant is fundamentally unitless. Temperature is defined by the average kinetic energy per particle; it is a measure of the undirected kinetic energy. Directed kinetic energy is just mass moving in the same direction, so it has no heat character, unlike the undirected kinetic energies within a system. Instead of measuring kelvins, we could measure the average (root-mean-square) kinetic energy of the particles in joules, but for historical reasons we use kelvins. Boltzmann's constant would then be joules/joules. Heat energy = S*T = entropy * energy/particle, so entropy's units are fundamentally "disorder of particles" and heat comes out to "disordered kinetic energy". Shannon's entropy S = N*H is "bit disorder": N = message length in symbols, H = disorder/symbol. Choosing the log base is the only remaining difference, and that is just a conversion-factor constant. By Landauer's limit, physical entropy = N*k*ln(2) where N is the number of bits. Solve for N to get the minimum number of yes/no questions needed to specify the momentum and position of every particle in the system.
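A quick numerical illustration of that last conversion (my own example values, not from the original email): turning a physical entropy S into an equivalent number of yes/no bits via N = S/(k*ln(2)).

use strict;
use warnings;

# Landauer relation: S = N * k * ln(2), so N = S / (k * ln(2)).
my $k = 1.380649e-23;        # Boltzmann's constant, J/K
my $S = 1.0;                 # example entropy in J/K (illustrative only)
my $N = $S / ($k * log(2));  # Perl's log() is the natural log
printf "S = %g J/K is equivalent to about %.3g bits\n", $S, $N;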

The entropy of every large-scale comoving volume of the universe is constant; entropy is a conserved constant there. See Weinberg's popular "The First Three Minutes". Only in engineering is the statement "entropy always increases" applied to an isolated system, and Feynman states that it is not an accurate description; it's just words. He gives the precise equation, which does not require entropy to "always increase". The problem is that there is no such thing as an isolated system in the universe: a box at any temperature emits radiation. "The heat death of the universe" is not a physical or cosmological theory and goes against astronomical observations. It is an engineering statement based on non-existent idealized isolated systems.
=======
Follow-up email

Physical entropy is S = k*log(W), and there are well-defined equations for this quantity. Unlike information entropy, physical entropy is very precise, not subject to semantics.

Specific physical entropy can be looked up in tables for various liquids, solids, and gases at standard conditions. You multiply it by the number of moles in your beaker to get the total physical entropy.

The kinetic energy of a particle that contributes to temperature is only the translational energy, but that is constantly exchanging with rotational and vibrational energies, which hold energy the way a flywheel or a swing does. These stored energies make the heat capacity different for different materials. If they did not exist, entropy would be a very simple calculation for all materials, depending only on the spatial arrangement of the atoms and their momentum (i.e., temperature). Physical entropy is a difficult concept primarily because of these potential energies getting in the way.

There's something interesting about life and entropy: life may not be reducing entropy on Earth overall, but when we create stronger structures, the atoms are in more rigid positions, which reduces their entropy. So the structures we call life have a real entropy that is lower than the non-life ingredients from which we came. This extends to our machines: solar cells, electric motors using metals to carry electron movement, carbon-fiber structures, nanotubes, steel buildings, steel cars, and CPUs all have two things in common: 1) we had to remove oxygen from ores to get materials that have roughly 10x lower entropy per mass than their original state; 2) acquiring energy (solar cells) to move matter (electric motors) to build strong structures for protection, to last a long time with a reliable memory, and to think about how to repeat this process efficiently (CPUs) is what evolution is all about. DNA is an unbelievably strong crystal-like structure with very low entropy; bones and teeth also. However, in making these low-entropy structures we release oxygen and CO2, and as gases they have much higher entropy, so it is not clear the entropy on Earth is being reduced. You can trace the fact that the things important to life have lower entropy so well that you have to wonder if lowering entropy is the whole point of life. One other thing: lower entropy, knowing what state something is in, means you have better command, control, and cooperation.

So my view is that life is the process of Earth cooling off, like a snowflake forming. A HUGE factor in creating life on Earth is the MOON; Isaac Asimov talked about this. The moon exerts a cyclic force on the tides and the mantle, and it's known that interesting (non-random) things happen when a thermodynamic system is driven by a non-random force. For example, objects of different shapes (or even a single shape like spheres) poured loosely settle into a looser packing than if you shake the container as you add the pieces, or just shake it at the end. Being more compact as a result of the cyclical force is reduced entropy.

Another aspect of this (probably irrelevant) is that the moon's pull is depositing energy into the Earth as tidal heating. The frictional forces in the tides and the mantle come at the cost of lower and lower rotational energy in the Earth. The moon moves to a higher orbit each year, so it is gaining energy while the Earth spins a little slower. Water molecules also lose rotational energy as a snowflake forms.

The Earth's seasons, caused by the tilt that came from a happenstance collision, have been important to creating life. The cyclic force of the moon has increased the number of concentrated ore deposits that economic life depends on a great deal. We are tapping into the lower entropy created by the moon. Our structures have lower entropy, but we increase external entropy not only through the gases but by spreading what were concentrated ores into machines that are all over the place, although the gravitational energies that dictate that positional entropy are small compared to the chemical-bond changes.

So life appears to be no more than the physical dynamics of Earth "cooling off", just like a snowflake that results from cooling off. Net heat energy and entropy are emitted to the environment. The Earth radiates about 17 photons in random directions for every incoming unidirectional photon, which is a lot of excess entropy being emitted, giving the opportunity for lower entropy.

You're right to ask what entropy/symbol means. There is a large semantics barrier that engulfs and obscures Shannon's simple entropy equation. Shannon himself did not help by always calling it "entropy" without specifying that he meant "specific entropy", which means it is on a "per something" basis. Specific entropy is to entropy what density is to weight.

"entropy/symbol" means the logarithm base in Shannon's entropy equation H = sum of -p*log(p) is not being specified and it is being left up in the air as a reminder that it does not have to always be in bits, or it can mean the log base is equal to the number of unique symbols in the message. In this latter case it is called normalized entropy (per symbol) that varies from 0 to 1, which indicates a nice pure statistical measure. Using normalized entropy, any group of distinct symbols can be taken at face value and you can ask the question "how disordered per symbol is it relative to itself from 0 to 1". Otherwise entropy/symbol  varies  on how many symbols you claim the data has (is it 8 bits or 1 byte?) and on what log base you choose.    

For example, binary data has 2 symbols, and if the 0's and 1's occur with equal frequency then the Shannon H (specific) "entropy" of 128 bits of data is expected to be 1 bit/symbol. Since the log base I've chosen is 2, this is also its entropy/symbol. If you call the symbols bits, then it is bits/bit, or entropy/bit. bits/bit = (bit variation)/(number of bit symbols) = variation/symbol = a true objective statistic. Its total entropy is S = N*H = 128 symbols * 1 bit/symbol = 128 bits, or 128 entropy. Similarly, "mean" and "variance" as statistical measures have no units.

Now let's say we can't see each bit, but only the ASCII characters that are defined by groups of 8 bits. There can be 256 different characters, and there are 128/8 = 16 of them in this particular message. We can't see or know anything about the 1's and 0's, but we can still calculate an entropy in bits. Since there are only 16 of them (let's say each symbol occurs once to keep it random), the straight entropy equation, ignorant that there might be 256 possibilities, gives log2(16) = 4 bits/symbol. This time it can't be called entropy/symbol without specifying bits, and the denominator is not bits either, but it is a Shannon entropy. The total entropy is then 4*16 = 64 bits. In other words, the data is now encoded (compressed) because I implicitly created a look-up table with 16 symbols, and I can specify which symbol out of the 16 with 4 bits each. With only bits, my lookup table is 1 and 0 instead of 16 symbols. At the cost of memory to create the table, I was able to shorten the representation of the message. If the log base is 16, then log16(16) = 1 = 1 character/character = 1 entropy/character. To change the log base from base 10 to any other base, divide by the log of the new base: log16(x) = log(x)/log(16).

But since we know there might have been 256 different characters, we could have chosen log base 256, in which case the Shannon H entropy = 0.5 bytes/symbol. This gives an information entropy of 0.5*16 = 8 bytes. The minimal amount of PHYSICAL entropy in a system that would be needed to encode this is 64*k*ln(2), since 8 bytes is 64 bits. See Landauer's limit.
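Here is a small Perl sketch (mine, a stand-in for the online calculator mentioned below) that reproduces the three viewpoints above: the same 16-symbol message measured with log base 2, 16, and 256.

use strict;
use warnings;

# Shannon specific entropy H = sum of -p*log(p), in a chosen log base.
sub shannon_H {
    my ($base, @counts) = @_;
    my $N = 0;
    $N += $_ for @counts;
    my $H = 0;
    for my $c (@counts) {
        my $p = $c / $N;
        $H -= $p * log($p) / log($base);
    }
    return $H;
}

# 16 distinct characters, each occurring once, as in the example above.
my @counts = (1) x 16;

printf "base 2:   H = %.2f bits/symbol,  total = %.0f bits\n",
    shannon_H(2, @counts), 16 * shannon_H(2, @counts);
printf "base 16:  H = %.2f (normalized entropy per character)\n",
    shannon_H(16, @counts);
printf "base 256: H = %.2f bytes/symbol, total = %.0f bytes\n",
    shannon_H(256, @counts), 16 * shannon_H(256, @counts);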

I know the above is confusing, but the equation is simple. I just wanted to show how badly semantics messes up everyone's understanding of information entropy.

There's an online entropy calculator that I recommend playing with to understand information entropy, if you're not going to work from the -p*log(p) equation and my comments above.

Thursday, May 5, 2016

This Brave New World

What we're seeing is hitting me like a ton of bricks, even though it has always been right there in front of us. I watch things and think about my children's future. I feel stupid compared to this swamping of A.I. that surrounds us. I speak to a computer silicon chip on my phone to text. It weighs 2,000 times less than my brain and it understands every word better than my kids do. Its mechanical ear weighs about 2,000 times less than my ear. The A.I. algorithms that power voice recognition are a very general model of the human cortex. All the A.I. that is coming is based on these Bayesian rules, with modifications that are continually bringing them closer to the cortex. The new chips being developed using different physical structures will implement them 1,000,000 times faster than current chips.

Amazon is on pace to eliminate every local store. Banking and finance are looking at a massive reduction in employees. Automated cars will cost 30% less from not needing a human interface and will eliminate highway deaths, eventually making it illegal to drive. Programmers are replacing all non-elite programmers and other thinking workers. China is replacing factory workers with robotics.

Evolution will continue to favor efficiency, which is the last thing a human worker is. But for the foreseeable future, it looks like the only job kids will need to do is accept money printed by the government and buy things. Efficiency is increasing faster than the government is printing money. Population growth is slowing and efficiency is increasing. What are people going to be like without having to work? I would have thought "idle hands are the tool of the devil", but crime has been dropping every year for 30 years.

Monday, May 2, 2016

Text Analysis of Satoshi Nakamoto, Nick Szabo, and Craig Wright

Newer investigation.

Nick Szabo does not appear to be Satoshi Nakamoto. 43 out of 56 times, Satoshi said "maybe"; 25 out of 25 times, Szabo wrote "may be". Seven out of 7 times Satoshi wrote "optimisation", which is a non-American spelling. He also used colour, organise, analyse, and synchronise, which Szabo did not.
Shinichi Mochizuki also used "may be" on his web page several times, but not "maybe".
Ross Ulbricht and Hal Finney are/were American. 
Vili Lehdonvirta uses British English, but Satoshi ranks an incredibly low 42 when compared against him, and vice versa he ranks a pretty low 19 when Satoshi is the baseline.
I could not find writings by Mike Clear.
The patent by Neal King, Vladimir Oksman, and Charles Bry ranks a respectable 9 when Satoshi is the baseline, but still below Szabo and Wright.

I'm working on a windows executable of this and will list it here above the script when it is done.

To do a professional job of this, use open source SVMLight with its ranking option.

If you think someone else might be Satoshi, send me 100KB of his writing for a comparison.

I wrote the Perl program below to compare texts from various KNOWN authors to a single "MYSTERY" author. I originally tried Shannon entropy, K-L divergence, and Zipf's law, plus various power and constant modifications of them, but the following came out the best. I also tried word pairs, word triples, variable-in-the-middle, and including punctuation and/or capitalization. Word pairs with punctuation could work almost as well. (p/q)^0.2 instead of log(p/q) worked about as well, as it seems to be a similar function. The ability to group words into nouns, pronouns, verbs, etc. while ignoring the actual word, then using triples or more of those categories, would give a different entropy calculation of the difference that would still have enough data and be a completely different measure, summable with this one for better accuracy (it's a different "dimension" of the author), but I do not have good word lists separated into those categories.

The equation
Sum the following for each word whenever the count of a mystery word is greater than the known author's count: log(count_mystery_word/count_known_word). Lowest score wins. The score indicates a naive difference in entropy between the authors. This is a Shannon entropy difference, aka K-L divergence, which is p*log(p/q), but it appears that because of Zipf's law the p outside the log had a negative effect on identifying authors in my tests, so I left it out, as determined by experimentation. To prevent division by zero, if count_known = 0, then count_known = 0.25/(number of words in the text) (half as likely as having had a 50% chance of accidentally not being used). Experimentally, 0.25 was nearly optimal, with values from 1 to 0.05 not making much difference. Not using the conditional and just running it straight had little effect, sometimes good, sometimes bad by up to 5%, but the conditional saves computation. It only loops through the mystery words, where the known author's count may be zero, but not vice versa. To get decent results by looping through the known authors, the words not used by the mystery author must be completely skipped, which is very different from needing to assign a reasonable probability to the missing words in the mystery-author loop. This is explained below.

The Huckleberry Finn Effect
Looping through all mystery words but not vice versa has an interesting effect. If you run it on "Huckleberry Finn", it is easy to see it is correlated with Mark Twain. But if you run it on other texts by Mark Twain as the mystery text, Huckleberry Finn is nearly LAST (see sample below). The larger the text sampled, the bigger the effect, because Huck's vocabulary was smaller than that of all the other authors, who started matching better with Twain in large samples. This method is saying Huckleberry Finn is Mark Twain, but Mark Twain is not Huckleberry Finn. Twain wrote Huckleberry Finn in the 1st person as an endearing, uneducated but smart kid with a limited vocabulary. Finn was not able to get away from the simple words Twain is biased towards, but Twain was able to use words Finn did not know, more like other authors. So there is at least one way authors may not be disguising themselves as well as they think. Except for word pairs: Huck jumped up higher when word pairs were treated like words. Treating common punctuation as words was needed to make word pairs as good as single words, and single-word treatment seemed to do worse when treating punctuation as words.

If Satoshi was focused on the task at hand, having to repeat himself, that is a lot different from Szabo writing complex educational articles with a large vocabulary and pronouns. Satoshi as the mystery text matches Szabo well, but with Szabo as the mystery text the match with Satoshi is not as strong. Combining them gives Satoshi rank #2 behind "World is Flat", which has a much more common general language than the others that rank high when running it one way. Szabo, with his larger language, was more like a larger number of other authors because of Satoshi's restricted task.

In summary, it is one-sided because it only compares words when the mystery words were more common than the known author's. I could not get better results any other way on the test runs (always excluding Satoshi, Wright, and Szabo). Apparently, it needs to claim the mystery author could have been like any of the known authors if he had wanted, but not vice versa. It is measuring the degree to which known authors are not the mystery author, but not vice versa. People are complex, and on some days the mystery author could have been like any of the 60 known authors, but it is less likely that any particular one of the known authors was trying to be like the single mystery author on this particular day. Almost 1/2 the data is being thrown out; I was not able to find a way to use it.

Which words give an author away?
All of them. It measures how much two texts are not similar. This is a long list of words for all authors. Similar texts have a longer list of words with a similar frequency and a shorter list of non-similar frequencies, but they're both long lists, with each word carrying a different weight based on log(countA/countB). Since the files are trimmed to the same size, log(countA/countB) is proportionally the same as log(freqA/freqB).

Description of the test data
All files were at least 250 KB. The Craig Wright file does not include his recent 5 Bitcoin "blog" posts or his DNS pdf paper. The Satoshi file is emails, forum posts, then the white paper. I deleted any paragraphs that looked heavy on code talk. Szabo's files are from his blog going back to 2011 and the 6 or so links from that blog back to his older articles. I made sure not to include his "Trusted Third Parties" article, which is a dead ringer for Satoshi's white paper, or his bitcoin-centric "Dawn of Trustworthy Computing". There were also 4 paragraphs in recent papers that mentioned bitcoin, and I removed them. "Crypto" appears only 3 times in the remaining Szabo sample file. Obvious and common bitcoin-related words can be removed from all files and it usually has no effect, and yet any bitcoin-centric papers will have a large effect.

Craig Wright jumps to the top if I include his 5 new bitcoin articles. If the sample size is reduced below 150k, Wright gets ahead of Szabo. This indicates it's not a very accurate program. Anyone with a strong interest in the same area as Satoshi probably has a 50% chance of ranking higher than Szabo.

A couple of interesting notes
Doesn't the last name come first in Japan, so his nickname should be Naka rather than Sato? Craig Wright said "bit coin" instead of "bitcoin" about 4 out of the 5 times he mentioned it in his 2011 text; Satoshi never did that. So human smarts can give a better ranking.

Author Comparison

mystery text: satoshi_all.txt 240000 bytes. 
known directory: books
Words to ignore: 
Using only first 240000 x 1 bytes of known files

First 42991 words from mystery text above and known texts below were compared.

1 = 699 szabo.txt 
2 = 703 What-Technology-Wants.txt _1.txt 
3 = 710 superintelligence_0.txt 
4 = 716 world_is_flat_thomas_friedman_0.txt 
5 = 719 What-Technology-Wants.txt _0.txt 
6 = 722 Richard Dawkins - A Devil's Chaplain.txt 
7 = 723 superintelligence_1.txt 
8 = 724 brown - web of debt part B.txt 
9 = 724 craig_wright.txt 
10 = 727 SAGAN - The Demon-Haunted World part A.txt 
11 = 732 Steven-Pinker-How-the-Mind-Works.txt 
12 = 735 crash_proof.txt 
13 = 737 SAGAN - The Demon-Haunted World part B.txt 
14 = 738 ridley_the_rational_optimist part B.txt 
15 = 738 craig wright pdfs.txt 
16 = 740 world_is_flat_thomas_friedman_1.txt 
17 = 740 rifkin_zero_marginal_society.txt 
18 = 742 ridley_the_rational_optimist part A.txt 
19 = 747 Steven-Pinker-The-Language-Instinct.txt 
20 = 752 Justin Fox - Myth of the Rational Market2.txt 
21 = 755 Rifkin J - The end of work.txt 
22 = 763 HEINLEIN THE MOON IS A HARSH MISTRESS.txt 
23 = 764 RIDLEY genome_autobiography_of_a_species_in_23.txt 
24 = 771 SAGAN - The Cosmic Connection (1973).txt 
25 = 771 SAGAN_pale_blue_dot.txt 
26 = 773 Richard Dawkins - The Selfish Gene.txt 
27 = 783 SAGAN-Cosmos part A.txt 
28 = 783 brown - web of debt part A.txt 
29 = 784 HEINLEIN Stranger in a Strange Land part B.txt 
30 = 785 HEINLEIN Stranger in a Strange Land part A.txt 
31 = 786 RIDLEY The Red Queen part A.txt 
32 = 789 SAGAN-Cosmos part B.txt 
33 = 792 foundation trilogy.txt 
34 = 794 SAGAN The Dragons of Eden.txt 
35 = 794 GREEN - The Elegant Universe (1999).txt 
36 = 795 GREEN The Fabric of the Cosmos.txt 
37 = 796 minsky_emotion_machines.txt 
38 = 796 RIDLEY The Red Queen part B.txt 
39 = 805 HEINLEIN Citizen of the Galaxy.txt 
40 = 811 HEINLEIN Starship Troopers.txt 
41 = 813 wander.txt 
42 = 817 how to analyze people 1921 gutenberg.txt 
43 = 820 twain shorts.txt 
44 = 821 HEINLEIN Have Space Suit.txt 
45 = 822 twain roughing it part B.txt 
46 = 826 works of edgar allen poe volume 4.txt 
47 = 831 freud.txt 
48 = 836 feynman_surely.txt 
49 = 839 twain roughing it part A.txt 
50 = 840 twain innocents abroad part A.txt 
51 = 843 The Defiant Agents - science fiction.txt 
52 = 845 moby-dick part B.txt 
53 = 846 twain innocents abroad part B.txt 
54 = 846 dickens hard times.txt 
55 = 847 samuel-butler_THE WAY OF ALL FLESH.txt 
56 = 847 Catch 22.txt 
57 = 850 Finnegans wake.txt 
58 = 851 Ender's Game.txt 
59 = 854 moby-dick part A.txt 
60 = 854 the social cancer - philipine core reading.txt 
61 = 856 J.K. Rowling Harry Potter Order of the Phoenix part A.txt 
62 = 859 dickens oliver twist part A.txt 
63 = 861 dickens tale of two cities.txt 
64 = 862 J.K. Rowling Harry Potter Order of the Phoenix part B.txt 
65 = 867 ivanhoe.txt 
66 = 870 twain - many works.txt 
67 = 871 dickens oliver twist part B.txt 
68 = 884 dickens david copperfield.txt 
69 = 887 AUSTIN_pride and predjudice.txt 
70 = 890 don quixote.txt 
71 = 896 AUSTIN_sense and sensibility.txt 

Perl code:
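The original script did not survive here, so below is a minimal Perl sketch of the scoring rule described above (word counts, a loop over only the mystery author's words, log(count_mystery/count_known) added when the mystery count is larger, and the 0.25/word-count substitute when the known count is zero). The tokenization and file handling are my own simplifications, not the original code.

use strict;
use warnings;

# Count word occurrences in a text file (lowercased, letters and apostrophes).
sub word_counts {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my %count;
    my $total = 0;
    while (<$fh>) {
        for my $w ( lc($_) =~ /([a-z']+)/g ) {
            $count{$w}++;
            $total++;
        }
    }
    close $fh;
    return ( \%count, $total );
}

my ( $mystery_file, @known_files ) = @ARGV;
my ( $mystery, $m_total ) = word_counts($mystery_file);

my %score;
for my $file (@known_files) {
    my ( $known, $k_total ) = word_counts($file);
    my $s = 0;
    for my $word ( keys %$mystery ) {
        my $m = $mystery->{$word};
        # Substitute the small pseudo-count 0.25/(words in text) when the
        # known author never used the word, to avoid division by zero.
        my $k = exists $known->{$word} ? $known->{$word} : 0.25 / $k_total;
        # Only words the mystery author used more often contribute.
        $s += log( $m / $k ) if $m > $k;
    }
    $score{$file} = $s;
}

# Lowest score wins (most similar to the mystery text).
my $rank = 1;
for my $file ( sort { $score{$a} <=> $score{$b} } keys %score ) {
    printf "%d = %.2f = %s\n", $rank++, $score{$file}, $file;
}

Usage would be something like: perl compare.pl satoshi_all.txt books/*.txt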



Examples of its accuracy on 240kb  (bottom ~55 files not shown)

SAGAN The Dragons of Eden.txt

43149 words
1 = 41227.33 = SAGAN - The Demon-Haunted World part A.txt
2 = 41449.17 = SAGAN - The Demon-Haunted World part B.txt
3 = 42541.09 = SAGAN - The Cosmic Connection (1973).txt
4 = 43153.35 = What-Technology-Wants.txt
5 = 43544.19 = SAGAN-Cosmos part B.txt
6 = 43801.02 = Richard Dawkins - A Devil's Chaplain.txt
7 = 44435.02 = SAGAN_pale_blue_dot.txt
8 = 44544.71 = RIDLEY genome_autobiography_of_a_species_in_23.txt
9 = 44608.56 = Steven-Pinker-The-Language-Instinct.txt
10 = 44721.12 = Steven-Pinker-How-the-Mind-Works.txt
11 = 44805.36 = SAGAN - Contact.txt

twain innocents abroad part A.txt
1 = 49152.42 = twain innocents abroad part B.txt
2 = 52359.88 = twain shorts.txt
3 = 52479.89 = twain roughing it part A.txt
4 = 52761.57 = twain roughing it part B.txt
5 = 56852.54 = twain1.txt
6 = 57091.27 = moby-dick part A.txt
7 = 57402.24 = twain4.txt
8 = 57414.97 = works of edgar allen poe volume 4.txt
9 = 57454.54 = twain2.txt
10 = 57494.54 = moby-dick part B.txt
11 = 58166.21 = the social cancer - philipine core reading.txt
12 = 58468.89 = twain - many works.txt
....
60 = 66056.21 = twain huckleberry_finn.txt WEIRD, nearly last

twain huckleberry_finn.txt
1 = 579.23 = twain huckleberry_finn_0.txt
2 = 27604.16 = twain - many works.txt
3 = 29175.39 = twain roughing it part B.txt
4 = 29268.58 = twain roughing it part A.txt
5 = 29467.46 = twain shorts.txt
6 = 30673.77 = twain2.txt
7 = 30721.36 = moby-dick part A.txt
8 = 31124.08 = twain innocents abroad part A_0.txt
9 = 31124.08 = twain innocents abroad part A.txt
10 = 31465.82 = twain4.txt
11 = 31548.63 = twain innocents abroad part B.txt
12 = 31631.72 = Finnegans wake.txt
13 = 32100.17 = twain1.txt

dickens oliver twist part A.txt
1 = 32084.14 = dickens oliver twist part B.txt
2 = 35266.22 = dickens tale of two cities.txt
3 = 35764.28 = works of edgar allen poe volume 4.txt
4 = 36244.83 = dickens hard times.txt
5 = 36361.77 = twain shorts.txt
6 = 36497.48 = moby-dick part A.txt
7 = 36533.16 = twain roughing it part A.txt
8 = 36781.17 = dickens david copperfield.txt

HEINLEIN Starship Troopers.txt
1 = 37785.89 = HEINLEIN Stranger in a Strange Land part A.txt
2 = 38258.97 = HEINLEIN Stranger in a Strange Land part B.txt
3 = 38914.36 = HEINLEIN Citizen of the Galaxy.txt
4 = 39771.37 = HEINLEIN THE MOON IS A HARSH MISTRESS.txt
5 = 40128.47 = twain shorts.txt
6 = 40552.46 = HEINLEIN Have Space Suit.txt

What Sherlock Holmes, Einstein, Heisenberg, the semantic web, the Null Hypothesis, Atheists, Scientists have in common
Short answer: like this equation, they demonstrate the usefulness of deducing the unlikely or the impossible. They measure what is not true, not what is true. Ruling out the impossible does not mean the alternatives you are aware of are the truth; only an omniscient being could know all the alternative possibilities. So unless you're a God, you can fall victim to this Holmesian fallacy in reasoning. Physics is also based on the null hypothesis, declaring what is not true rather than truths. Einstein said thermodynamics and relativity are based on simple negative statements: there is no perpetual motion machine, and nothing can go faster than light. Heisenberg's uncertainty principle similarly states a negative: it is impossible to know both position and momentum, or time and energy, exactly. The null hypothesis and "not believing" in science are a formal embodiment of this approach. Sherlock Holmes falls victim to the fallacy named after him in the same way the null hypothesis in drug screening can victimize innocents: "When you have eliminated the impossible, whatever remains, however improbable, must be the truth" - Sherlock Holmes. The problem is that the truth includes things you can't imagine, errors in measurement, or other compounds that give the same chemical reaction. Why is the null hypothesis so useful? Because reality is bigger and more complex than we can imagine, bigger than our physics can describe. Likewise, people and their writing are so diverse that it is a lot easier to measure the degree to which two texts are different. The semantic web absolutely depends on fuzzy definitions to be able to describe a world that is bigger than reason. See Gödel's theorem and why mathematicians have trouble walking and talking.

Sunday, May 1, 2016

Notes on postural tachycardia

Postural orthostatic tachycardia syndrome (POTS) is manifested by a significant increase in heart rate (an increase of more than 30 bpm or a heart rate of 120 bpm or more) during postural challenge without a fall in blood pressure (it can be quite variable). The mechanism of POTS is incompletely understood and is associated with physical deconditioning, and it usually happens in young females.

In Europe, chronic fatigue syndrome (CFS) is called myalgic encephalomyelitis (ME). Orthostatic intolerance is a broad title for blood pressure abnormalities such as neurally mediated hypotension (NMH) and POTS. Orthostatic intolerance is a symptom of CFS. Patients with CFS have findings similar to those in patients with POTS during the head-up tilt table test. Patients with POTS may also have fatigue as a prominent clinical feature. Orthostatic tachycardia and autonomic abnormalities are present in both conditions, which makes them qualitatively difficult to differentiate.

Patients should be advised to take aerobic exercise on a regular basis so that venous return from the lower extremities can be increased. Patients with dysautonomic syncope can be advised to wear graded compressive hosiery extending up to the waist, thus helping to increase static pressure at the calf and decrease venous pooling. A high fluid intake should be encouraged with at least 3–5 g of common salt.

Saturday, April 30, 2016

Major problem with constant quantity coin like bitcoin

Attempted RECAP:

If savers of a constant quantity coin loan it out like dollars (capital investment or whatever), and if it is the world's default currency like the dollar, major problems will arise if they save more than they spend.

First consider the problem if more people simply save and use the coin more in transactions, without loaning it out. The number of coins usable for transactions dries up. Even if Keynesian economics did not apply (i.e., wages and prices were not sticky), there is a problem. Or rather, wages and prices are sticky because prices and wages will have to decrease in terms of the currency. Long-term contracts could not be written in terms of the coin. Stability of prices and wages does not result from a constant-quantity currency, but from a coin that expands with use (and contracts in a downturn). This problem is also the appeal of a constant-quantity coin to savers: they get more than their initial purchase for nothing. A deflating coin does not benefit everyone; it only means past savers get more financial power over future workers and over future savers, to a degree that is GREATER than the work they provided in order to make the initial purchase.

But the situation is MUCH worse if they DID invest in society and obtain interest in return for loans. The key problem is that loans are normally made at a rate that is greater than the increase in society's productive capacity that is supposed to result from the loan. Even loans in an expanding currency lead to an exponential increase in savings. A shilling loaned out since Jesus at 6% would be worth a block of gold the size of the solar system, minus Uranus and Pluto... back in 1769.
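As a rough illustration of that compounding claim (my own toy calculation; it only shows the raw growth factor, not the gold comparison):

use strict;
use warnings;

# Growth factor of one unit loaned at 6%, compounded annually.
for my $years ( 100, 500, 1000, 1769 ) {
    printf "%4d years at 6%%: factor of about %.2g\n", $years, 1.06**$years;
}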

Savers of a constant quantity coin become excessively powerful over late adopters and non-savers if they loan it out at interest. At some point, workers will abandon the coin in favor of another coin, which is, in effect, enforcing an inflating coin. Workers need a coin whose supply expands to keep up with the coin's use, but not enough to cause price inflation. But like past savers, they will be guided towards a deflating coin. Changing coins is never easy, also because a currency gains infrastructure and thereby monopoly status. Bitcoin has an infrastructure, like the dollar, that pushes workers towards adoption even if it were not deflating. The existence of cryptocoins may not increase the ability to switch, so the above problems may be unavoidable.

Once monopolies like Google, FB, YouTube, and Amazon are established, switching is not easy unless the underlying technology changes everything (Amazon replacing Walmart, Wikipedia replacing encyclopedias). The difficulty in switching is the degree to which abuse will occur. The free market promotes monopolies, and is thereby not really free, as a result of people needing a single standard in many things. Lack of a standard is chaotic in a bad way.

A constant-quantity coin is really a deflating coin, which adds the problem of it acquiring monopoly status without the macro-economic merit that a constant-value coin would have.

Even miners saving transaction fees is a fundamental problem as it forces value to increase beyond the amount they invested to acquire it.  A saver's unjustified gain from deflation is a late-adopter's unjustified loss.

Loaning the coin amplifies this effect. Debtors will be pushed by lenders into the coin for this reason, the same way the IMF and World Bank pushed 3rd world countries into loans based on an external currency (they pushed the dollar in Latin America and Asia, and the Euro in Greece, Latvia, Iceland, and Ireland). As those economies collapse(d), they do (did) not have the ability to reduce the quantity of the coin in which the loans were made, which is a type of default. If they had been allowed to take the loans in their own currency, the lenders would have made DARN sure the society would succeed as a result of the loans, instead of pushing them into selling off natural and taxpayer-built infrastructure. Loans should be like stock investments, where the lender has a vested interest in the success of the debtor, preventing the debtor from becoming enslaved. A contract with such one-sided terms should not be legal (heads you and I win, tails I win and you lose). A free market with these types of loans and no government oversight has a "benefit" and justice served: population reduction, where the lenders will eventually have killed the enslaved and then have to start working themselves.

Bitcoin could be used mainly as a store of value. For example, banks and rich people could choose it for transfers without increasing its use in marketplace transactions. Miners would sell profits as they are gained for a currency that expands and contracts with society's productive capacity and thereby has a stable value for wages and purchases. This still results in early adopters unjustly gaining at the expense of late adopters.

Here are some related tweets
Like every other holder, I hold bitcoin because I want to be in the ruling class, getting something for nothing.

Wanting to get something for nothing is a ruling class moral.

If BTC succeeds, we will be to that extent. We gain at later-adopters' expense, getting more of world's pie w/o contributing to it.

AndreasMAntonopoulos
Not like every other holder. That's your reasoning but it's not universal.

AndreasMAntonopoulos
 Banks can't adopt a public, open, borderless network like bitcoin. The regulations prevent them from doing so
They can buy it up before the masses, restricting access to it, driving it up, driving up control.

 If the banks adopt faster than the masses, change is questionable. They did it in the late 1800's with gold.

Tuur Demeester
I don't hold BTC to be in the ruling class — I hold it because I want to be free...

Your freedom comes at expense of late-adopters' financial freedom. Constant-quantity coin is ultimate banker CTRL of poor.

Great irony = BTC holders imagining moral or macro-economic benefit, but I shouldn't ascribe selfish motives to selfish outcomes.

World's problem is inefficiency of muscle, brain, bone, and photosynthesis. Solution: machines displace biology. It's accelerating.

 Solar cells 30x more efficient than plants. Doubling fast. Using brains in future = like using shovels to dig. Not good end 4 DNA.

How are coin coders/miners/holders different from gov/banks/voters? No difference?

Miners/coders/holders prop each other up at expense of new entrants. Ladder-up=holding coin & loaning it out @ 5% BTC


 Constant currency may work in Germany, but not most places using anglo-dutch lending. https://books.google.com/books?id=xH5w

1) It must be at least mixed as transactns will be miners' fees. No trans=no protection/value. Making more transactions possible...

2) is a bad thing because that allows it to be used for wages & prices instead of large asset transfer. If wages & prices made ..

3)  in BTC then it becomes entrenched & usable as a weapon against non-wealthy. Or rather, loans in it should not exceed debtors...

4) assets, at least on macro scale so that debtors are not in a required downward spiral unless ruining debtors is a goal. Loans

5) should be an at-risk partnership like a stock where creditor does not succeed unless debtor succeeds. Destroying debtor at ...

6)creditor's gain wrongly makes loans diff from stocks: heads creditor wins, tails creditor wins = anglo/dutch loans,not in Germny.

 It does nothing fundamentally different. Merchants-workers-customers transacting in it will run afoul of coders/miners/holders..

1st group wants stability or inflation to fight against the "bankers" trying to make it deflate.

holders="immoral" on accident if it becomes default transaction currency instead of mainly large transferr payments like gold.

Once mining stops, if you do not pay your taxes (fees), you're isolated from transactions. ~same

 A world BTC w/ infrastructure-based-monopoly on currency & held by banks loaning @ % = unmitigated disaster due to fix-quantity.

They would continue to gain BTC, driving its value up, needing to loan out less & less = parasites choking off economic activity

This is called "the magic of compound interest" applied to a fixed quantity coin, which is worse than http://michael-hudson.co

Switch just as hard as now if govs required tax payment in BTC (banking/Satoshi lobbies) & 90% merchants & apps required it.

$ is dominant not by force but like youtube, VHS, Google, M$ etc monopolies: society needs a single infrastructure=bitcoin danger

So my logic applies only if "bitcoin everywhere". Switching could prevent evil, but world stuck on $ did not prevent evil.

Youtube relevance = infrastructure = BTC relevance. World not forced on $ except by financial system infrastructure.

comment on an economic book review, Michael Hudson's "Killing The Host"

This entire book explains how Marx was clueless when it came to the dangers of finance capital displacing industrial capital. As Hudson explains, Marx was far too optimistic about capitalism. Marx thought banks would make loans to build capital infrastructure, not merely to gain access to iPhone patents or to buy 4 years of work from 2 moderately intelligent people for $1 billion (Google, FB, Snapchat, Whatsapp, Youtube) merely as a way to gain market-share eyes to advance monopolies on search, social connections, messaging, and videos. But the worst of the anglo-dutch loan situation is the lending that drives up asset prices, which stifles worker cost-efficiency and diverts what could have been tax revenue into bank profit, and the loans to take over infrastructure in order to sell it off, making a profit for the hawks and banks mainly by nullifying retiree plans and capitalizing on adverse stock market fluctuations. Notice there is no illegality in any of this detrimental behavior. There is only the absence of a functioning government that could stop macro activity that hurts the strength of us all. Good government enables and advances system-wide profit from cooperative behavior (a prisoner's-dilemma solution) at a level above the individual transactions. Individual transactions have zero concern about their system-wide effects. A functioning healthy body results from the development of a governing brain over the system that keeps the cells from acting merely for highest profit (cancer).

Notice that the home country of Veblen (the only author Einstein liked as much as Bertrand Russell) and Marx, Germany, is the one dominating the EU production world by NOT following anglo-dutch debtor-as-slave lending (heads bank wins, tails bank wins), which is very different from stock investments where the "creditor" (investor) depends on the success of the debtor.** Or rather, regulations from GOVERNMENT (oh, the horror) on loans in Germany enabled Marx's capitalistic axioms to actually apply, propelling Germany to recurrent greatness without the fallout Marx thought would occur. How close is Germany to what Marx desired? "Communistic" oligarchies like old Russia and old China from the 1930's to 1989 (not the new ones) are not "Marxism" in action, but strong democracies are not far off, as I'll explain.

**Except German attitude towards Greece where German banks want to follow anglo-dutch rules.

But as far as a world order goes, this reviewer might be right: Hudson's godfather was Trotsky, who differed from Lenin in his Marxism by saying the world, instead of individual countries, should defend itself against capitalistic lobbying of government. Marxism was at its core an uprising against government arising from worker consciousness. How ironic that "free market" capitalists also want an overthrowing, except they want some purely theoretical form of cooperative anarchy instead of "for the people" socialism. We need more good governing and less bad governing, not more anarchy. In the 3rd world there is not enough tax income to create a functioning government; it results in a very good model of anarchy. It can improve if outsiders allow them to engage in equitable trade, but only by getting together and forming a functioning democratic government ("capitalistic socialism"). Oligarchies are often the result of a democracy that failed to be as socialistic as the oligarchs promised. Not kicking private companies out is compatible with socialism. "For the people, by the people" is a socialist slogan, not just an American one. Venezuela is a democracy pretending to have socialism when it has little. Price controls (Nixon) and oligarchies (Venezuela) are not socialism, but they did occur in democracies (or "Republics" for the unimaginative). When a democratic vote turns into an oligarchy, people wrongly yell "socialism"; they should yell "democracy". The goal of lobbyists is to turn a democracy away from socialism into legalized capitalistic monopolies having a free-for-all via bad laws. Is it the government's fault the lobbyists are in the way, or the voters'? Will those former voters function better in anarchy?

Socialism in the U.S. is being subverted by democratic voting that is allowing capitalism to lobby the government.

The entire reason democracy exists is so that it can bias capitalism in favor of socialism by giving everyone a single vote, subverting the non-equal money in capitalism and giving the poor future opportunity. Workers do not own all the shares like Marx wanted, but they can tax away ridiculous profits that were unfairly gained from capitalizing on society's need for a single search engine, a single social site, and dominant messaging apps. Those companies succeed not by work, intelligence, or moral superiority, but by merely being the first that was good enough for society to bless with its need for a single-player monopoly.


Saturday, April 16, 2016

Bits are most efficient, entropy/energy => Zipf's law, new memory systems have a solvable problem

This will need a strong interest in Shannon entropy, Landauer's limit, and Zipf's law.

Consider a message of N symbols composed of n unique symbols, where each unique symbol has an energy cost that is a linear function of its rank. For example, a brick at 10 different heights on an inclined plane: to be distinguishable states from a distance, the heights have to be some fixed minimal distance apart. If the inclined plane is not even with the ground but raised a distance "b", and the vertical distance between distinguishable states is "a", then the energy required for each symbol is E = a*i + b, where i = 1 to n is assigned to the n unique symbols.

It will cost a lot of energy to use each symbol with equal probability, but that is what gives the most information (entropy) per symbol. What is the tradeoff? What usage profile over "i" will result in the highest information per energy required? Underlying all this is the energy required to change a bit state, E = k*T*ln(2), Landauer's limit. The "a" and "b" in my linear E equation above have this E as a constant factor in them, and a + b > k*T*ln(2), which I'll discuss later.

Zipf's law (here's a good overview) is a good solution for many energy-increasing profiles of the symbols. The Zipf-Mandelbrot law is Ci = C1*p/(i+q)^d, where p = 1 and q = 0 for this discussion, because p cancels and q, at best, makes 100% efficiency possible where 80% is possible without it, so I'll not investigate it except to say that using d = q = 5x the d's I get below reaches 100% in many cases.

So I'll investigate this form of Zipf's law:
Ci =  C1/i^d
where Ci is the count of symbol "i" out of n unique symbols occurring in a message of N total symbols, C1 is the count of the most-used symbol, "i" is the ranking of the symbol, and what d values are optimal is what all this is about. Note: max entropy (information) per symbol occurs when d = 0, which means every symbol occurs with the same probability. [update: I found a paper that compares normalized entropy (inherent disorder from 0 to 1, without regard to n or N) to Zipf's law. The most interesting text, halfway between order and disorder with normalized entropy = 0.5, is when d = 1 for very large n, greater than 50,000. Shakespeare has d = 1 versus only about d = 0.8 for others. Language approaches this, or well exceeds it if you consider word pairs as symbols, and we do indeed remember words in pairs or more. For a small set of symbols, n = 100, a different d is needed to get H = 0.5. Maybe the word types diagrammed in grammar could be a set this small and follow this rule if the writer is smarter. http://arxiv.org/pdf/1001.2733.pdf ]
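Here is a small Perl sketch (my own, assuming the Ci = C1/i^d form above) that computes a normalized entropy (log base n, so it runs from 0 to 1) for a Zipf-distributed symbol set at a few values of d and n. The paper linked above uses its own definition of normalized entropy, so this is only illustrative.

use strict;
use warnings;

# Normalized Shannon entropy (log base n) of counts Ci proportional to 1/i^d.
sub normalized_entropy_zipf {
    my ($n, $d) = @_;
    my @p = map { 1 / $_**$d } 1 .. $n;
    my $sum = 0;
    $sum += $_ for @p;
    $_ /= $sum for @p;            # convert weights to probabilities
    my $H = 0;
    $H -= $_ * log($_) for @p;    # entropy in nats
    return $H / log($n);          # divide by log(n) to normalize to 0..1
}

for my $n (100, 1000, 50_000) {
    for my $d (0, 0.8, 1, 2.4) {
        printf "n=%6d d=%.1f  normalized H=%.3f\n", $n, $d,
            normalized_entropy_zipf($n, $d);
    }
}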

Note: using a=1, it turns out my efficiency equation below has 100% as its max. 

There is an ideal Zipf factor (d) that determines the way computers should use symbols in order to give max bits/energy if they are going to shift towards memristor-type memories and computation that uses more than 2 states per cell. The ideal d will increase from our current value of 0 as the energy baseline b, divided by the number of unique symbols, decreases as computers get better.

Heat generated by computation is always the limitation in chips that have to synchronize with a clock, because at high clock speeds the light signal can only travel so far before distant parts can no longer stay in sync. A 10 GHz square wave that needs everything synced within 2/3 of its cycle can only travel about 2 cm down the twisted wiring inside a 1 cm chip at the speed of light. This means the chip is limited in size. As you pack more and more in, the heat generated starts throwing low-energy signals out of the states they are supposed to be in. Using higher-energy signals with higher voltage just makes it hotter even faster.
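A quick back-of-the-envelope check of that 2 cm figure (my own arithmetic): the distance light travels in 2/3 of a 10 GHz clock period.

use strict;
use warnings;

# Distance covered at the speed of light in 2/3 of a 10 GHz clock cycle.
my $c        = 3.0e8;    # speed of light, m/s (signals in wiring are slower)
my $f        = 10e9;     # clock frequency, Hz
my $fraction = 2 / 3;    # usable fraction of the cycle
printf "%.1f cm\n", 100 * $c * $fraction / $f;   # about 2 cm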

From this, one might never want to consider going to more than 2 symbols, because it is going to require higher and higher energy states. That's generally true. But the 66% my equation gives for bits can be made 80%, even when using more than 2 symbols, by letting d = 2.4, if it is a very efficient idealized computer.

After numerical experimentation with the equation below, I've found (drum roll please):

d = 0 if b > 10*n and n > 2 (modern computers)
d = 0.6 if b = n
d = 1 if b = n/3 and n > 9 (language)
d = 1.5 if b = 10 and n = b^2
d = 2.4 to 2.7 if b < 1 and n = 100 to 30,000 (idealized computer)

An equation for d as a function of b and n could be developed, but seeing the above was good enough for me.

Bits are by far the most energy-efficient (bits of information per joule of energy cost) when you are maximizing the entropy per symbol over the N symbols (i.e., when you require the symbols to occur with equal frequency).

d = 1 does not, by far, use equal frequency of symbols, which is what is required for maximal entropy per symbol in a message of N symbols. This is partly why languages carry only about 9.8 bits per word out of a possible 13.3 (Shannon's entropy would be H = 13.3 bits per word if words were used with equal frequency, because 2^13.3 = 10,000 words).

New ideal memory systems having more symbols per memory slot and operating at the lowest theoretical energy levels (b = 0) under this model (E = a*i + b) will be 0.06*n times less efficient if they use the memory states (symbols) with equal frequency (which is max entropy per symbol) than if they follow the rule of d = 2.4. For example, if you have 100 possible states per memory slot, you will be about 6x less efficient if you use the states with equal frequency. For the larger b of systems in the near future, with b = 10*a and n = b^2, d = 1.5 is 2x more efficient than d = 0.

Side note: here is a paper that assumes the cost is logarithmic instead of linear. It may be an exponential function in many situations: words rarely used may be hard to think of and understand, so it may be more efficient to use 2 well-known words that have the same meaning. Conversely, trying to put together two common words to describe one nearly-common word can be more costly.

The a+b is also the minimal energy a "0" state in a computer needs to be above background thermal energy fluctuations in order to have a VERY small probability of accidentally being flipped into another state. The probability of a thermal fluctuation reaching that energy is e^(-(a+b)/kT), so a+b needs to be a lot greater than kT, which modern computers are getting close to having problems with. Distinct symbols contained in the same memory slot would have to be at energies above this. Computers use only 2 states, 0 and 1, but I am interested in language, with words as the symbols. We require different amounts of energy when speaking and in deciding the next word to speak. The energy gap between each of these separate symbols could get smaller and smaller while maintaining the same reliability (logarithmic instead of linear), but if the reader of such states has a fixed ability to distinguish energy levels, then an "a" slope is needed, having units of energy per symbol.
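As a rough illustration (my own multiples of kT, not the post's): the Boltzmann factor e^(-(a+b)/kT) for a few signal energies, showing how quickly the accidental-flip probability falls once the energy is many kT.

use strict;
use warnings;

# Probability that a thermal fluctuation reaches a signal energy of x*kT.
for my $x (1, 10, 20, 40, 60) {
    printf "E = %2d kT  ->  P ~ e^-%d = %.2e\n", $x, $x, exp(-$x);
}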

What is the most efficient way for the sender to send a message of x bits? "Count_i" below is the count of unique symbol i, and the efficiency being maximized is total information divided by total energy: N*H divided by the sum over i of Count_i*(a*i + b).

Actually the units are joules/joules, because as I said before "a" and "b" have a k*T*ln(2) constant in them based on Landauer's limit and how reliable the symbols should be. 

Also, it's not H = 1 at the end but H = log2(n), so my last line should be Efficiency = c*log2(n)/n^2 when the counts are all equal. But the point remains: from the last equation, bits are the most energy-efficient way of sending information if you require the symbols to occur with equal frequency, which is normally the case if you want the most information per symbol. But intuition already knows this: it's no surprise at all that using the 2 symbols that require the least amount of energy will be the most energy-efficient way of sending entropy. n can't be 1 because the entropy is 0 in that limit.

This does not mean using only the 2 lowest-energy symbols will be the most energy-efficient method of sending information. Remember, the previous paragraph requires the symbols to be used with equal frequency. That's a good assumption in today's wasteful computers, but it is not the most energy-efficient method when the computer is efficient enough to be affected by the energy-cost function of the symbols. On less frequent occasions, you could use the energy-expensive symbols to "surprise" the receiver, which carries a lot more information. To find the optimal usage, the efficiency equation above must be optimized numerically, because it is easy to solve analytically only when H = 1 (or log2(n)); doing that resulted in my claims above.
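A minimal sketch of that numerical experiment (my own code; the grid of d values and the example (b, n) cases are arbitrary): for given a, b, and n, find the Zipf exponent d that maximizes bits of entropy per unit of energy under E_i = a*i + b.

use strict;
use warnings;

# Entropy per energy for n symbols used with Zipf counts Ci ~ 1/i^d,
# where symbol i costs E_i = a*i + b energy units.
sub efficiency {
    my ($n, $d, $a, $b) = @_;
    my @w = map { 1 / $_**$d } 1 .. $n;
    my $sum = 0;
    $sum += $_ for @w;
    my ($H, $E) = (0, 0);
    for my $i (1 .. $n) {
        my $p = $w[$i - 1] / $sum;
        $H -= $p * log($p) / log(2);    # bits of entropy per symbol
        $E += $p * ($a * $i + $b);      # average energy per symbol
    }
    return $H / $E;
}

# Scan d for a few (b, n) cases like the ones discussed above, with a = 1.
for my $case ( [0, 100], [10, 100], [100, 100] ) {
    my ($b, $n) = @$case;
    my ($best_d, $best_eff) = (0, 0);
    for (my $d = 0; $d <= 3; $d += 0.1) {
        my $eff = efficiency($n, $d, 1, $b);
        ($best_d, $best_eff) = ($d, $eff) if $eff > $best_eff;
    }
    printf "b=%3d n=%d  best d ~ %.1f  (bits per unit energy = %.4f)\n",
        $b, $n, $best_d, $best_eff;
}

As a check, efficiency(100, 0, 1, 0) and efficiency(100, 1, 1, 0) come out near the 13% and 27% figures mentioned in the Excel notes further down.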

Here is a 2015 paper that professionally does what I am going to do, but they only look at logarithmic energy functions, which I agree might be better for things like cities; for signal generators, I think distinct energy levels between states are an important category.

The rest of this post is my older notes on this. It will ramble. Most of it is trying to see how an energy cost results in Zipf-like distributions.

Zipf's law states that the number of people in a city is roughly 1/rank times the population of the largest city. It seems to usually be based on a feedback effect, where the efficiency gained by more "people" joining an already-efficient thing (like a city, or the buyers of a product or service) makes it even more efficient. Mandelbrot increased its accuracy by using c1/(rank+c2)^c3. I'm going to use it to find the best constants for different energy functions that result in the highest entropy. How does a set of cities correspond to seeking the highest entropy/efficiency? I might have the sequence backwards: it might be that since people are potentially randomly distributed (high entropy), the only thing guiding their selection of city is efficiency. There's not one mega-city because its success in some ways causes its failure in other ways. It can be a fractal pattern arising from seeking the more efficient nearby options, like side sparks on lightning bolts and the routes blood vessels establish, if not brain wiring. Current in wires will seek other routes as highly efficient (low resistance) routes become too heated from too much success. After defining symbols to represent these situations, it will be found to follow some Zipf-Mandelbrot law, but I want to derive it. Mandelbrot talked about fractals and the energy cost also. We have market dominators like Walmart and Amazon who do a good job because lots of people support them, and lots of people support them because they do a good job, so it's a feedback. It's been said that China and Soviet Union cities, at least in the past, did not follow the rule, apparently because aggressive socialism interfered with market forces. Others tried to dismiss it as just a statistical effect, but those efforts were never convincing and are out of favor.

In terms of word frequency, there can be feedback. "A" is an efficient word: a lot of words like to use it, probably because a lot of other words like it. By everyone using it, it might have become very general at the price of precision. It might have been less efficient in the past, but it then got used more, which made it more efficient. My point is to mention this feedback before carrying on with relating the resulting energy efficiency to frequency of occurrence. It also shows my efficiency equation might need to be logarithmic, but unlike others, that is not my starting point.

If we find a little profit from an easy action, the language of our actions will have a high count for that action despite the small profit. And if actions with an ever-increasing amount of energy can be combined in a clever way (high entropy), we can get a lot more energy from the world by carefully selecting how often we take the small actions compared to the energy-intensive actions. By making these actions interdependent to achieve a larger goal (including the order in which they occur), a lot more "heat" can be generated. "Heat" needs to be used imaginatively here to apply, but it does apply in a real way.

If we have a bunch of levers to control something, we will make sure the levers we use the most require the lowest energy to act on. Short words are easier to say, so we give them the definitions we need to use the most. The mechanics of speech and hearing do not allow us to choose the next syllable out of every possible syllable with equal ease. The most frequently used words, syllables, or whatever sequences are the ones that require the least amount of food energy; we've assigned them to the things we need to say the most. But that's probably not exactly right: we say things in a way that allows us to use the easy words more frequently.

I tried 100 symbols in Excel with a = 1 and b = 0. Result: H = 1 (d = 0) is half as efficient (13%) as counts_i = 1/i, which had 27% efficiency. The value of "a" does not matter because it scales both efficiencies equally. I tried d/(e*n)^c as a count-per-symbol factor, and c = 2.6 was best at 80%; the d and e had no effect. 1/(n+d)^c got it higher and higher, finally to 0.99 as both d and c rose above 6 with n = 100; it might need to be higher with higher n. If the "a" of the linear function changed, the 80% changed by a factor of 1/a. With significant b, c wanted to come down to 1, and raising d and c did not help much; they needed to be lower. At a = 1 and b = 10, c = 1.5 was good. Higher b = 100 needed c closer to 1, but 0.5 to 2 was OK, now closer to the law for the cities.

In conclusion, the frequency = 1/rank^c rule with c < 1 should be accurate for b >> a, which is probably normally the case (speaking requires more energy than choosing what to say), if we are selecting words based on the amount of energy required in choosing them. Trying to determine the energy function is a fuzzy thing: it includes how hard it is to say some words after others and how well you can recall words. If this latter effect is large, great variability between writers might be seen. When an author sees little difficulty in using a wide variety of words, the energy differences will be smaller and he will be closer to using words evenly, with a smaller power factor in 1/r^c. c = 0 is the same as saying H = 1 (aka H = log2(n)). Reducing n did not have a big effect; I tested mostly on n = 100. I see no reason to think that this idea is not sufficient or that it is less effective than other views.