Wednesday, June 28, 2017

Zawy v2 difficulty algorithm

#!/usr/bin/pseudo_perl
#
# Zawy v2 difficulty algorithm 
# Simple averaging window with option to use dynamically varying window.
#
# Core code with fluff removed:  (TS=timestamps)
#  if (current_TS > previous_TS + 6*T )  { current_TS=previous_TS + 6*T; }
#  elsif ( current_TS < previous_TS - 5*T )  {current_TS=previous_TS - 5*T; }
#  next_D = sum(last N Ds) * T / [max(last N TSs) - min(last N TSs)] / (1+ln(2)/N)
#
# Changes:
# A preference for low N.  
# Added an adjustment factor that is important at low N: 1/(1+ln(2)/N).
# Includes option for adjustable averaging window (probably inferior to simple low N).
# Includes timestamp limit for timestamp manipulation/error protection.
#
# Miners should be asked to keep their timestamps accurate or it will help attackers and
# block release will be slower.
# Low N is suggested for best hash-attack protection.
# See verbose text at link below for explanations (if this is not verbose enough)
# https://github.com/seredat/karbowanec/commit/231db5270acb2e673a641a1800be910ce345668a#commitcomment-22615466

# Begin setting constants for this coin

T = 240;   # Target interval
MinHashAttackDuration = 8; # the minimum multiple of T that you expect attacks to last. Sets N.
timestamps_are_provably_correct = "no";  #  "no" if miners or pools are assigning timestamps
employ_variable_averaging_window = "yes"; # see paragraph below

# End setting constants for this coin

# Modifying the logic of the algorithm below this point is not recommended.
# Trying to fix something usually breaks something else. 

# Set averaging window. I have a lower limit of 8 based on the effect timestamp errors can have.
# Every time someone sets the timestamp to a max or min, it often jumps to the max or min change in D.  
# There is no fix. 
# N=8 has only 47% more >4*T solvetimes than N=30.  Low N is my mantra. I recommend trying 8.
# Even 4 can work, even with timestamp errors. 
N1=int(MinHashAttackDuration);
if ( N1 < 4) { N1 = 4; }

# Notes if you use a variable averaging window:
# It will be smoother most of the time, but takes ~N/4 blocks longer to respond and recover from 
#  sudden hash changes. Seriously, just choosing the smallest N is probably best unless
# you have a strong desire for a slowly-changing difficulty.  
# Description: decrease the statistical significance required for a HR change to trigger to a 
# lower N averaging window from +/- 2.0 to +/- 1.0 STDevs from estimated mean of HR in past  
# N samples as N increases from N1 to 2N1. It sometimes will reach >4N, as statistics dictates. 
#  The formula works as it should.  I've spent days testing it.
# This seems to increase solvetimes by about another 1/(1+ln(2)/N) factor which I'm not adjusting for.
Smax=2; Smin=1; # STDevs away from mean.

# Upper & lower TS limits (positive multiples of T) applied relative to the previous TS
# to protect against timestamp manipulation.
if (timestamps_are_provably_correct == "yes" ) { UTSL = 20; LTSL = 18; }
else { UTSL = 6; LTSL = 5; }

# The following are kind of fail-safes. Possibly not needed and definitely can negate 
# intentions of other parts of the algorithm.
# Bringing them closer to 1.0 as in other diff algos reduces hash
# attack protection.  Not letting them be symmetrical has hard-to-detect risks.
# These values are aggressive in allowing D freedom to change 
# and seem to work without breaking anything.
# Lowering these is similar to increasing N and narrowing the timestamp limits.
MI=  2; # max Diff increase per block 
MD= 0.5; # max Diff decrease per block  MD=1/MI is symmetrical?

# End setting of constants by the algorithm.

# definition: TS=timestamp

#  Begin algorithm

# Protect against TS errors by limiting how much current TS can differ from previous TS.
if (current_TS > previous_TS + UTSL*T )  { current_TS=previous_TS + UTSL*T; }
elsif ( current_TS < previous_TS - LTSL*T )  { current_TS = previous_TS - LTSL*T; }

if (employ_variable_averaging_window == "yes") {
     for (I=N1 to N) { # check every window that is smaller than current window.
        # create linear function: STDev decreases as I (aka N) increases to N
         STDevs = Smax-(Smax-Smin)/(2*N1 - N1)*(I-N1); 
         NE = (max(last I timestamps) - min(last I timestamps)) / T; # expected N for this time range
         if ( abs(I - NE) > STDevs*NE**0.5 ) { N=I; } # This is the core statistics. It's correct.
}
}
else { N=N1; } 

next_D = sum(last N Ds) * T / [max(last N TSs) - min(last N TSs)] / (1+ln(2)/N)

#the above is same as
#  next_D = avg(last N Ds) * T / avg(last N solvetimes) / (1+ln(2)/N)

if (next_D <= 0 ) { next_D = avg(last N Ds); }  # do not let it go negative.

# Do not let D rise and fall insanely as security precaution
if ( next_D/previous_D > MI ) { next_D = MI*previous_D; }
if ( next_D/previous_D < MD ) { next_D = MD*previous_D; }

# increase size of averaging window by 1 per block 
if (employ_variable_averaging_window == "yes") { N=N+1; } 
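For anyone who wants to test the core idea outside of pseudocode, here is a minimal Python sketch of the fixed-window case. The variable-window option is omitted, and the function and variable names are only illustrative, not from any coin's code:

```python
import math

def zawy_v2_next_difficulty(difficulties, timestamps, target, N=8):
    """Fixed-window core of the rule above: average difficulty over the last N blocks,
    scaled by target time over the actual time span, with the 1/(1+ln(2)/N) low-N
    correction.  Lists are ordered oldest to newest; the variable-window option and
    per-block clamping of every historical timestamp are omitted for brevity."""
    ts = list(timestamps[-(N + 1):])
    # Clamp only the newest timestamp to +6*T / -5*T of the previous one
    # (the full algorithm clamps each timestamp as it arrives).
    ts[-1] = min(ts[-1], ts[-2] + 6 * target)
    ts[-1] = max(ts[-1], ts[-2] - 5 * target)

    window = ts[-N:]
    time_span = max(window) - min(window)
    if time_span <= 0:
        time_span = 1  # all N timestamps collapsed together; avoid division by zero

    next_d = sum(difficulties[-N:]) * target / time_span / (1 + math.log(2) / N)

    # Fail-safes: never negative, and per-block change limited to 2x up / 0.5x down.
    prev_d = difficulties[-1]
    if next_d <= 0:
        next_d = sum(difficulties[-N:]) / N
    next_d = min(next_d, 2.0 * prev_d)
    next_d = max(next_d, 0.5 * prev_d)
    return next_d
```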

Argument that low N is best in difficulty algorithms and why dynamic averaging window is not a benefit

I can't recommend a switch from v1 to v2 (static N to dynamic N).  The smoothness gained by the higher N is not much:  surprisingly, the std dev of solve times increases only 5% from N=30 to N=8. The std dev of D goes from 0.18xD to about 0.45xD for N=30 versus N=8 (for N=8 this means 97.5% of D values are less than 1 + 1.96x0.45 ≈ 2 times what they should be). Long story short (due to the Poisson median being 0.693 of the average): going from N=30 to N=8 means only a 47% increase in 4xT solvetimes.  The dynamic window does not capture this benefit:  those >4xT solvetimes are exactly the statistically unlikely events that will trigger the dynamic window back to a lower N, canceling the primary benefit of it rising back up to a large N.  It looks a lot smoother and nicer _**most**_ of the time when hash rate is constant, but the painful small-N events are not reduced.
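The solvetime and difficulty spreads quoted above are easy to reproduce roughly by simulation. Here is a rough Monte Carlo sketch in Python (constant hashrate, exponential solvetimes); the exact percentages will vary with run length and seed:

```python
import math, random, statistics

def simulate(N, blocks=100_000, T=240, seed=1):
    """Constant hashrate of 1, so the 'correct' difficulty is 1.  Simple N-block
    averaging with the 1/(1+ln(2)/N) correction.  Returns the standard deviation of
    the difficulty and the fraction of solvetimes longer than 4*T."""
    random.seed(seed)
    window_D = [1.0] * N
    window_ST = [float(T)] * N
    all_D, long_solves = [], 0
    for _ in range(blocks):
        st = random.expovariate(1.0 / (window_D[-1] * T))  # mean solvetime = D*T at hashrate 1
        long_solves += st > 4 * T
        window_ST = window_ST[1:] + [st]
        next_D = (statistics.mean(window_D) * T / statistics.mean(window_ST)
                  / (1 + math.log(2) / N))
        window_D = window_D[1:] + [next_D]
        all_D.append(next_D)
    return statistics.pstdev(all_D), long_solves / blocks

for n in (8, 30):
    sd, frac = simulate(n)
    print(f"N={n}: stdev(D) ~ {sd:.2f},  solvetimes > 4*T: {100 * frac:.2f}%")
```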

Tuesday, June 27, 2017

Cryptocurrency difficulty algorithm postulates


Here are my "difficulty algorithm postulates" I want people to consider before creating or changing a difficulty algorithm.
  1. For a given hashrate with gentle variation, the simple average below is the best algorithm: 
    • Next D = avg(past N Ds) x TargetInterval / avg(past N solve times)
    • For whatever reason, it needs an adjustment factor for low N to keep solve time on track and make D more accurate:  Next D x 1/(1+0.7/N).  
    • The N used for averaging past D must be set to the N used for past solve times.
    • Using the median is not nearly as good as using the average, and there is no benefit to using the median (a small check of this is sketched in code after this list).
  2. A faster response to hashrate changes will come at a cost in solve time stability. This is not a bad thing. Use the lowest N you can tolerate to get the fastest response. Low N causes large non-attack solve time variation. Consider down to N=8 if hash-attacks are a problem.
  3. Limiting the rise and fall in the difficulty per block is similar to increasing N, but is much less accurate.
    • I place limits on the rise and fall to be equal to what I think possible only as a security measure. 
    • Placing limits on the rise and fall to block an event you do not want is denying the truth of the observation that you have asked the average to report.
  4. Enforcing asymmetric limits on difficulty and timestamp changes is risky.
    • There is a temptation to allow faster decreases than increases in the difficulty per block (that results from the average above) in order to get back to normal after an attack. This may help keep the block emission rate on schedule and reduce normal miner losses.  But it also enables attacks to resume more quickly, which might exactly negate the two benefits. Avoid this more seriously if the attacker is intelligent.  If timestamps are assigned by miners, forward-stamping (combined with this asymmetry) will make D begin artificially lower in the next attack, amplifying the original problem instead of helping it. But if the allowed increase and decrease in D is symmetrical, then a subsequent accurate timestamp that negates the previous bad timestamp will be able to get D back to its proper value.
    • Conversely, there is a temptation to allow faster increases than decreases in difficulty per block in order to dissuade on-off hash attacks.  This will directly slow the block emission rate. It potentially increases normal miner losses if it does not actually dissuade attacks. Avoid this more seriously if the attacker is dumb. It better enables a malicious attacker who is not interested in profit to drive the D up, or to drive it up for the purpose of causing future oscillations if the diff algo is unwisely advanced and complex.
    • Limiting the amount the timestamp can be ahead of time more than it can be negative is like allowing D to increase faster than decrease, with the same type of side effects.
    • Limiting the amount the timestamp can be negative is like allowing D to increase faster than it can decrease, with the same type of side effects.
    • Symmetrical in the above is not exactly linear because the median of a Poisson with mean TargetInterval (T) appears to be ln(2) x T = 0.693 x T, but I have not addressed this.
    • Timestamp limits: I believe the forward timestamp should be limited to +6x and -5x of T from the previous timestamp instead of my previous statements of +6x and -6x, because the "expected" timestamp is +1x, so +6x and -4x is what symmetry mathematically requires.  But I want a -5x limit, in violation of perfect symmetry, out of more fear of greedy +6x occurring than of -4x accidents or maliciousness. Reminder: two -5x in a row could cause a negative difficulty if N=8, so there needs to be protection against a negative difficulty. 
  5. Despite 3 and 4, there may be a way to use them to enable D to return to normal more quickly post-attack.  This is really needed because the avg solve time increases (delaying coin release) when there are a lot of large on-off instances: even with low N, D gets stuck high post-attack.
  6. There is no way to stop >50% miners from using the timestamp to make difficulty = 0.  This assumes there is not a trusted third party enforcing a clock (like ETH), which would be in violation of Szabo's and Satoshi's mandates. 
    • Might is right. 51% power is truth.
    • 51% (or the trusted 3rd party) controls the clock which means they control coin emission rate.
    • Bitcoin uses > 50% consensus to certify not only single-spend transactions but also the time.
    • Fear of a hard fork may be what prevents miners from doing this overtly.
  7. Difficulty algorithms should not have momentum. Predictive algorithms that look at the slope of recent changes in D to estimate a future D are vulnerable to large on-off miners (and possibly even an accidental and unconscious consortium of miners in search of personal profit) who can force the algorithm into oscillations, turning on when D is low and is starting to rise, and off before it reaches a peak. This is the Derivative part in PID controllers. PI controllers such as the average of the past are safer.
  8. Algorithms that try to protect against specific attack behavior are inherently vulnerable. 
    • It should be assumed that protection against specific attacks is automatically leaving an unexpected hole.
    • If an opponent can see the strategy you've employed that assumes something beyond your scientific observations, he can change his plan of attack but you can't change your defense.
    • For example, if you choose a fixed N based on how long you expect attacks to last, the attacker may make the attacks shorter but more frequent. 
  9. Miners acting in their own best interests weed out weak coins, are the mothers of invention, and/or are encouraging adoption of a single coin. Each of these might be "good" instead of an "attack".
    • Item 6 may mean all coins that are not the largest for a specific type of hardware are destined for a limitation on their size (if not outright failure) that is more brutal than Zipf's law.  We currently see something like Zipf's law in cryptocurrency market caps but if item 5 is correct,  it might become 1/Rank^2 or worse in market caps instead of 1/Rank.  This enforces Satoshi's original vision that the largest coin's "might is right" will make it less subject to attack than its clones. 
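To illustrate the median-versus-average point in item 1 (referenced in that bullet), here is a small, purely illustrative Python sketch: it estimates the average solvetime from N exponential samples using either the sample average or the sample median rescaled by ln(2), and compares the spread of the two estimators.

```python
import math, random, statistics

random.seed(1)
T, N, trials = 240, 8, 20_000
avg_based, med_based = [], []
for _ in range(trials):
    solvetimes = [random.expovariate(1 / T) for _ in range(N)]
    avg_based.append(statistics.mean(solvetimes))
    # Rescale the sample median by ln(2) so it also targets the true mean (T).
    med_based.append(statistics.median(solvetimes) / math.log(2))

print("spread (stdev) of average-based estimate:", round(statistics.pstdev(avg_based), 1))
print("spread (stdev) of median-based estimate: ", round(statistics.pstdev(med_based), 1))
```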

Sunday, June 25, 2017

Advantage of a simple difficulty algorithm with low N, but a tweak is needed

The following looks like a good zawy v1 to consider. If this is acceptable even when the hashrate is constant, then the switching from high N to low N  (zawy v2) is not needed.

- N=6
- The low N causes avg solvetime to be 15% longer (i.e. D is 15% too high), so increase the solvetime average by 15% with 1.15 x avg(6 solvetimes).  I believe this is caused by the combination of a low N and the median of a Poisson being 0.75 of the average.
- +/- 5x timestamp limit from previous timestamp.
- D per block change limited to 0.5x down and 2x up.
- The above two should remain symmetrical unless testing supports something different.  See last paragraph.
- There should not be any attempt to address bad timestamps other than the above TS and D limits. If a bad time comes in that lowers D to the 0.5 limit then the next correct timestamp will bring it back up.  If you're getting two bad timestamps in a row in the same direction very often, the problem is not fixable. Larger N might help.
- Always averages, no medians.
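Here is a minimal Python sketch of the recipe above (names are illustrative; the solvetimes are assumed to already have the +/- 5x limit applied):

```python
def zawy_v1_next_difficulty(difficulties, solvetimes, target):
    """Zawy v1 with N=6 as listed above.  `difficulties` and `solvetimes` hold at least
    the last 6 values, oldest first; solvetimes are assumed to already have the
    +/- 5x timestamp limit applied."""
    N = 6
    avg_d = sum(difficulties[-N:]) / N
    avg_st = sum(solvetimes[-N:]) / N
    next_d = avg_d * target / (1.15 * avg_st)   # the 1.15 low-N correction from above

    # Symmetrical per-block limits: 2x up, 0.5x down.
    prev_d = difficulties[-1]
    next_d = min(next_d, 2.0 * prev_d)
    next_d = max(next_d, 0.5 * prev_d)
    return next_d
```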

The timestamp manipulation will only help pools with > 50% hashrate. Other miners should keep their clocks accurate to keep difficulty more stable when N is low.  Otherwise, every miner suffers from an unstable D, and an unstable D causes blocks to be released more slowly:   the avg of a D ranging from 0.5 to 2 is always more than the avg of a D that stays at 1.  I have not addressed this averaging effect and it leaves open a possibility of improvement:
When there are frequent on-off high hash attacks, the above N=6 (and any other algo that provides good protection) will slow the avg solve time. I see a >30% slowing of block release with frequent on-off attacks.  Slow increases and decreases are not a problem: it is the sudden on-off nature of "attacks" that slows block release even as they get blocks quickly at low difficulty. So it's a double whammy to honest miners.  This leaves open the ability to reduce the post-attack delays by accelerating the decline back to normal if the algorithm thinks an attack was present and is now ending.  Off-hand, I would keep a rolling avg D for something like N=100 and if the past N=12 were > 2x avg(100 D) then
if Target/(1.15*avg(6 ST)) < 0.95
change
next D = avg(6 D) x Target/(1.15*avg(6 ST))
to
next D = avg(6 D) x Target/(1.15*avg(6 ST)) x 3/4

The above is kind of a weak fix. I am trying to employ a different method, such as more rapidly being able to detect a post-attack event in a continuous fashion. Discrete if-then statements expose the weakest point to an attacker.  But the idea is to **show higher skepticism toward allowing D to increase as D gets higher and higher above a longer-term average, but less skepticism toward letting it come back down.**  I think this could be an improvement that is more important than the dynamic avg window.
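For concreteness, here is a rough Python sketch of the discrete version described above (the N=6 rule plus the 100-block rolling average and the 3/4 factor); the thresholds are the ones quoted and everything else is illustrative:

```python
def post_attack_next_difficulty(difficulties, solvetimes, target):
    """Zawy v1 (N=6) plus the rough post-attack rule above: if the last 12 difficulties
    average more than twice the last 100, and solvetimes are still running slow,
    multiply the next difficulty by 3/4.  Assumes at least 100 prior values."""
    avg6_d   = sum(difficulties[-6:]) / 6
    avg6_st  = sum(solvetimes[-6:]) / 6
    next_d   = avg6_d * target / (1.15 * avg6_st)

    avg100_d = sum(difficulties[-100:]) / 100
    avg12_d  = sum(difficulties[-12:]) / 12
    if avg12_d > 2 * avg100_d and target / (1.15 * avg6_st) < 0.95:
        next_d *= 3 / 4
    return next_d
```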

In the crypto-anarchists worldview, miners taking advantage of the lack of a timestamp and difficulty algorithm not adjusting fast enough are really just capitalists investing in adventures that persuade others to employ fixes.

Saturday, June 24, 2017

Equations for cryptocoin hash and timestamp attacks

I remain confused that alt coins are not having a huge problem with >50% miners constantly forward-stamping the timestamps, causing difficulty to drop to zero. I have the following equations to describe the problems:
Let X = hashrate new miner adds to existing hashrate. For example if X = 2, the total hashrate after the new miner comes online is 2+1 = 3x so that he has 2/3 = 66% of total.
The unavoidable unfixable problem I see is that the difficulty will drop each block for any X > 1:
next D = previous avg D x [1/(1+M/N)]^[(X-1)/(X+1)]
M=the "M x TargetInterval" limit on the allowable timestamp, after the previous timestamp.
N=the averaging or median window.
For X=1 note that next D=previous D. See next equation for X<=1.
For any X>1, the difficulty will slowly decrease. Small M and large N help, but are not a fix. For example, M=3 and N=100 (the most protective numbers I can imagine), and X=2x, the difficulty will drop 1% per block for as long as the miner is present with 2x. In 450 blocks new D = 1/100th of old D instead of 3xD.
Median makes it more complicated, but I do not think it will fundamentally change anything.
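Plugging the example numbers into this formula reproduces the ~1% per block figure (a quick Python check):

```python
M, N, X = 3, 100, 2
per_block = (1 / (1 + M / N)) ** ((X - 1) / (X + 1))
print(per_block)          # ~0.990, i.e. difficulty falls about 1% per block
print(per_block ** 450)   # ~0.012, i.e. roughly 1/100th of the starting D after 450 blocks
```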
For X<=1, the difficulty does not drop forever, but has a steady state value that is tolerable (assuming I do not have an error):
next D = correct D x [1 - (1 - 1/(1+M/N)) * X/(X+1)]
for example, M=6x, N=30, and X=1 (50% hashrate) gives 92% of the correct D.
another example: M=6x, N=15, and X=0.5 (33% of hashrate) gives 91%.
I needed this because I want to go to N=8 in the dynamic averaging window:
M=6x, N=8, X=1 gives 78% which is tolerable.
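The steady-state expression can be checked the same way; it reproduces the three percentages above:

```python
def steady_state_fraction(M, N, X):
    # next D / correct D for a miner adding X times the existing hashrate
    return 1 - (1 - 1 / (1 + M / N)) * X / (X + 1)

print(steady_state_fraction(6, 30, 1))    # ~0.92
print(steady_state_fraction(6, 15, 0.5))  # ~0.905
print(steady_state_fraction(6, 8, 1))     # ~0.786
```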
For high on-off hash attacks where the off time = N, I have
DA / desired D = (1+ PX/2N ) / (X+1)
where DA = avg actual D during the attack, P = number of blocks (< N) that the attack lasts. For P>N, they are not getting away with anything after N, so P=N for P>N.
The attackers' "unjust profit" is
unjust attacker profit = other miners' losses = PX / (1+PX/2N).
There is no solution except to make N small.
This is why I'm working on a variable averaging window difficulty algorithm. N must be as small as possible when there is an attack, but larger at other times because small N causes a lot of accidental variation.
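The on-off attack expressions can also be evaluated directly; the P, X, and N values below are arbitrary examples:

```python
def attack_ratios(P, X, N):
    P = min(P, N)  # for P > N the attacker gains nothing extra, so cap P at N
    avg_d_ratio   = (1 + P * X / (2 * N)) / (X + 1)   # DA / desired D during the attack
    unjust_profit = P * X / (1 + P * X / (2 * N))     # attacker gain = other miners' losses
    return avg_d_ratio, unjust_profit

print(attack_ratios(P=8, X=2, N=8))    # the smaller window keeps DA/desired D closer to 1
print(attack_ratios(P=8, X=2, N=30))
```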

Friday, June 23, 2017

Protecting alt cryptocoins from bad timestamps and high hash rates

This is a possible solution to protect small coins against 51% miners who forward-stamp the time.

It appears any miner with 51% can cause difficulty to lower forever.  Median and average do not affect this.  1/2 the time he will acquire the median, so half the time he will lower it to 1/6 D, averaging 1/3 D every 2N.   So D=0 in 1 day.  If you cut the high 6 out of 42, as well as the low 6, the result is the same.  Cutting only the high 6 will cause difficulty to rise forever if miners are honest.  Again, 51%  = truth.   51% = owner of clock.  There is no solution except for the coin to self-destruct until miners are honest.

If I am correct, big miners must already know this, so they must be doing it only some of the time because they do not want it to cause too much trouble and then cause a hard fork.

Timestamp fix idea:  penalize all miners if some miners have a bad timestamp.

This will protect against a 20x attacker who is trying to control the clock. These numbers are approximate; they depend on how fast the rest of the difficulty algorithm brings difficulty back down.  If a negative block timestamp appears, raise difficulty 10x and lower it 11% per block for 20 blocks, then resume regular difficulty.  If more than ~5% of the network has a bad timestamp (either too far backward, or too far forward in a way that causes others to be backward), difficulty would rise indefinitely. Those not aware will stop mining. Those aware of this will remain until the miners with bad timestamps leave. It forces honesty and more stable difficulty. All miners are penalized if some miners keep a bad timestamp. 1% bad timestamps will cause a 22% increase every 100 blocks.
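A quick check of the penalty arithmetic (a 10x jump lowered 11% per block for 20 blocks lands back near the original difficulty):

```python
d = 10.0                 # difficulty jumps 10x after a bad (negative) timestamp
for _ in range(20):
    d *= 1 - 0.11        # lowered 11% per block for 20 blocks
print(round(d, 2))       # ~0.97, roughly back to the pre-penalty difficulty
```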

Past coin holders are protected against coin dilution.

If miners have an error of either exactly  +20 or -20 seconds then 25% of the time a forward stamp will be followed by a backward stamp.    1.24% of solves are within this 40 second window when T=240.     0.25 of 0.0125 = 0.3% of the time this will occur on accident.  So everyone with ~20 second error is 1/3 as bad as 1% with 240 second error.  These numbers are approximate since the statistics is complicated but it shows the things that have to be considered.

A problem with the above is that a malicious actor with only 1% hashrate can make your difficulty jump 10x once every 100 blocks.

Wednesday, June 21, 2017

Zawy v2 difficulty algorithm (dynamic averaging window)



========

if ( this_timestamp - last_timestamp > 6*TargetInterval ) {
    this_timestamp = 6*TargetInterval + last_timestamp;
}
if ( this_timestamp - last_timestamp < -6*TargetInterval ) {
    this_timestamp = -6*TargetInterval + last_timestamp;
}

# next line prevents an artificially low timestamp from affecting it.
if ( current minus last timestamp is not negative AND N >= 19 ) {
    # if we just saw 19 blocks when we expected 12, go to N=19
    # will trigger on average once per 50 blocks on accident
    if ( average(last 19 solvetimes) < TargetInterval/1.66 ) { N=19; wait=N; i=0; }
}

# next line prevents the 8th solvetime from being artificially large
if ( none of past 7 solvetimes were negative AND N >= 6 ) {
    # If we just saw 6 blocks when we expected 12, go to N=6.
    if ( average(last 6 solvetimes) > TargetInterval/0.50 ) { N=6; wait=N; i=0; }
}

# If we saw 5 blocks when we expected 1, go to N=5. This needs to be last.
# Will trigger about 1 per 250 blocks on accident. Detects >2x hash rate quickly.
# Since it is one-sided (for a rise but not a fall in hashrate) it may slow coin release rate a little.
if ( none of past 5 solvetimes are negative ) {
    if ( sum(last 5 solvetimes) / TargetInterval < 1 ) { N=5; wait=N; i=0; }
}

# Give outliers a chance to get out of the new avg range assigned above before letting
# N increase, but it did not seem to have a large effect. Debating it.
if ( wait > 0 ) { wait=wait-1; }
else { N=N+1; }

Next_D = avg(past N Ds) * TargetInterval / avg(past N solvetimes);

# Testing indicated that allowing more than (1+2.5/Nstart) increases and less than
# (1-2.5/Nstart) decreases did not help, where Nstart = the minimum N. For Nstart=5
# these are 1.5 and 0.50 limits.
if ( Next_D > (1+2.5/Nstart) * avg(past N Ds) ) { Next_D = (1+2.5/Nstart)*avg(past N Ds); }
if ( Next_D < (1-2.5/Nstart) * avg(past N Ds) ) { Next_D = (1-2.5/Nstart)*avg(past N Ds); }
==============

I have what I think could be an excellent algorithm for all coins that would be especially beneficial to small coins, but it's a little difficult to understand. I would like sumokoin to eventually employ it so that monero and cryptonote can believe in it. They are currently showing an interest [here](https://github.com/monero-project/research-lab/issues/3#issuecomment-309427606) and [here](https://github.com/seredat/karbowanec/commit/231db5270acb2e673a641a1800be910ce345668a#commitcomment-22628026) but they will lose interest if we are not able to demonstrate something better on a live coin.




I want to demonstrate in a spreadsheet today or tomorrow that it is noticeably superior, at least when timestamps are accurate. Then I want to harden it against timestamp errors as in the previous pseudocode. The long post at cryptonote above describing "Zawy v1.0" is my recommendation until this continuous version is finished.




In general, I want it to check for sudden hashrate increases or decreases and switch to the correct N if it detects an unlikely event. I want it to do it continuously, minimizing any constants that people select. Actually, people should not select any constants in cryptocoins. For example, total coin and coin release rate should not be set by devs, but by the marketplace. Likewise, the following will let the "market" of "network hashrate" "bid" for the correct N averaging time.




```

# This demonstrates the idea. I tested it last year.
# This is to be substituted in place of the 3 conditionals in the previous pseudocode.
# I'll employ it in a spreadsheet to prove it's better than simple averaging.
# It will not work well until it is hardened against timestamp errors.
# The previous pseudocode is hardened against timestamp errors and
# shows generally how the following needs to be changed.

Nstart = 5; # minimum number of blocks we will check for a statistical event
Nend = 36;  # max number of blocks we will check for a statistical event

# Go to N averaging = Nstart=5 only ~1% of the time by chance alone
STDEVstart = 4;
# Go to N averaging = Nend=36 32% of the time by chance alone
STDEVend = 1;

# Now create a function using the above that will determine the amount of
# statistical significance we require before we switch to an N averaging that
# is between Nstart and Nend.
# I'll substitute the above assigned values for clarity.
function STDev(NA) = 4 - (4-1)/(36-5)*(NA-5)

N = current number of blocks used in averaging, determined in previous code

# Test all past-block ranges for a statistical event, from Nstart to N
for NA = Nstart to N {
    NE = N_Expected_blocks_in_NAs_time = sum(NA previous solvetimes) / TargetInterval;
    S = STDev(NA);
    # NH = an NA above this should not have occurred in NE time within the bound of STDev
    # NL = an NA below this should not have occurred in NE time within the bound of STDev
    NH = NE + S*SQRT(NE);
    NL = NE - S*SQRT(NE) + 1; # the +1 was needed in testing to make it symmetrical.
    if ( NA > NH or NA < NL ) {
        # Throw out the earliest blocks in case they were before the attack or outliers.
        # The +2 prevents throwing out 2 out of 5. +3 might be better.
        N = int(2*NA/3 + 2);
        exit for loop, last NA;
    }
}

```

Here are the results for 10x hashrate attacks. "Blue" areas indicate blocks were obtained at a low difficulty. Black areas that are not on top of "blue" are blocks obtained at a costly difficulty. Everything is correct when black is on top of blue.


The "Zawy v2" algorithm is almost exactly like the theory-based pseudo-code which shows it is based on good theory. N=30 is not shown because the thin high-hash rates are only 15 blocks wide and N=30 does not give good results on N=15 attacks. Before the attacks, you can see a gentle rise to 2x hashrate and a drop down to 1/2 hashrate. Default hashrate and difficulty = 1, so the scale is accurate.


**edit:**


- with zawy v1 N=18 with these hash attacks, 10% of blocks were gained at < 1/2 the appropriate difficulty (attacker) and 10% suffered >2x the appropriate difficulty (constant miners).
- with zawy v2 it was 4% and 7%
- since the v2 dynamic averaging period went as low as 5 and there were 6 attacks in 970 blocks, 6x5/970 = 3% makes sense, as does 6x18/970 = 11% for v1. The 7% is a little high because there is a 1440 limit on the timestamp that prevents long solve times from affecting it (it's there to prevent timestamp manipulation from forcing difficulty low)