This shows you the differences between two versions of the page.
Both sides previous revision Previous revision | |||
benchmarks:middlesquare [2017/05/07 01:20] pier4r |
benchmarks:middlesquare [2017/05/09 09:03] (current) pier4r |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | <wrap info> | ||
+ | ====== Calculator benchmark: middle square method seed test ====== | ||
+ | |||
+ | **The idea** | ||
+ | > In September 2013 I started to contribute on wiki4hp, and my first self-chosen task was: mine information from discussion places of HP calculators' | ||
+ | > Benchmarks are not the ultimate test to measure the absolute computational power of a calculator (or a computer), but they give a good approximation of relative speed between calculators; | ||
+ | > | ||
+ | > Now, if we look at code of the two popular benchmark, we can see that the // | ||
+ | > | ||
+ | > So thinking about a benchmark that use at least the four (+,-,*,/) operations, I stumbled upon the middle square method to produce, easily, random numbers((Even if the method has a lot of flaws from the point of view of randomness.)). Since we don't care about a particular benchmark algorithm, but we use the benchmark only to compare devices, it should be ok to test almost all programmable calculators, | ||
+ | > | ||
+ | >So, if you agree (else, feel free to fork the idea on this wiki!) let's look the pseudocode. | ||
+ | |||
+ | See also [[http:// | ||
+ | |||
+ | **Criticism**\\ | ||
+ | [[http:// | ||
+ | |||
+ | So actually this should be seen as challenge, and only after that as an approximative benchmark. | ||
+ | |||
+ | **The Pseudocode** \\ | ||
+ | As the two popular benchmarks shows, we don't test only the computational power of the devices, | ||
+ | but also the programming ability of the user to " | ||
+ | < | ||
+ | k:= a positive integer | ||
+ | n:= 100^k | ||
+ | For i:=(n/10) to (n-1) do { | ||
+ | old := i | ||
+ | sequenceIsConvergent := false | ||
+ | limitCyclicSequences := 0 //because we can end in cyclic sequences and we don't want to use a lot of memory! | ||
+ | while ( (NOT sequenceIsConvergent) AND (limitCyclicSequences < n) ) { | ||
+ | new = extract_middle_number from old^2 (see below) | ||
+ | if new == old { | ||
+ | sequenceIsConvergent = true | ||
+ | } | ||
+ | else { | ||
+ | old := new | ||
+ | limitCyclicSequences = limitCyclicSequences+1; | ||
+ | } | ||
+ | }//end while | ||
+ | }//end for | ||
+ | |||
+ | -> | ||
+ | |||
+ | How does extract_middle_digits work? | ||
+ | Given n, that is a power of 10 of the type 10^d, | ||
+ | its square is equal to 10^(d*2) and it has (d*2+1) digits. | ||
+ | |||
+ | We consider only the last (d*2) digits. | ||
+ | That is: if n=10000, 10000*10000 = 100000000 we consider only 00000000 without the first 1. | ||
+ | |||
+ | Then we ' | ||
+ | For example if d=4 and the number under process is 1542, | ||
+ | we consider 1542^2 as 02377764 instead of 2377764. | ||
+ | |||
+ | Why d=4 ? Because we use as seed, with n=10000, numbers from 1000 to 9999, so numbers having 4 digits. | ||
+ | |||
+ | After that we pick the d=4 middle digits, from 02377764 we pick 02[3777]64. | ||
+ | So the middle number extracted is 3777. | ||
+ | </ | ||
+ | |||
+ | The method of extraction for the digits can be anything you want, a standard method involving modulo and divisions or some other method. You can use an iterative subtraction to get the modulo instead of using the calculator' | ||
+ | |||
+ | **More details!** | ||
+ | < | ||
+ | Let k=1 , so we have as reference n:= 100^k = 100 | ||
+ | Do sequences for all the seeds from a:=n/10 to n-1. | ||
+ | So the seeds are {a: | ||
+ | |||
+ | How do we compute a sequence? | ||
+ | a:=10 | ||
+ | seed:=a | ||
+ | 1. we first square the number | ||
+ | 10*10 = 100 | ||
+ | 2. 100 squared is 10000, with 4 zeroes, then we consider the previous number with four digits (with a leading zero) | ||
+ | 0100 | ||
+ | 3. Then we pick the middle digits of the number, as many as the zeroes of n (that is 100), the new seed is | ||
+ | 0[10]0 | ||
+ | 4. The new seed is the same as the initial seed, then if we reapply the process we end up always with 10. Therefore the sequence is convergent. | ||
+ | |||
+ | Next step | ||
+ | a:=11 | ||
+ | seed:=a | ||
+ | seed^2 -> 121 | ||
+ | See it with 4 digits -> 0121 | ||
+ | Pick middle digits -> 0[12]1 , seed:= 12 | ||
+ | seed^2 -> 144 | ||
+ | See it with 4 digits -> 0144 | ||
+ | Pick middle digits -> 0[14]4, seed:=14 | ||
+ | seed^2 -> 196 | ||
+ | See it with 4 digits -> 0196 | ||
+ | Pick middle digits -> 0[19]6, seed:=19 | ||
+ | seed^2 -> 361 | ||
+ | See it with 4 digits -> 0361 | ||
+ | Pick middle digits -> 0[36]1, seed:=36 | ||
+ | seed^2 -> 1296 | ||
+ | See it with 4 digits -> 1296 | ||
+ | Pick middle digits -> 1[29]6, seed:=29 | ||
+ | seed^2 -> 841 | ||
+ | See it with 4 digits -> 0841 | ||
+ | Pick middle digits -> 0[84]1, seed:=84 | ||
+ | seed^2 -> 7056 | ||
+ | See it with 4 digits -> 7056 | ||
+ | Pick middle digits -> 7[05]6, seed:=5 | ||
+ | seed^2 -> 25 | ||
+ | See it with 4 digits -> 0025 | ||
+ | Pick middle digits -> 0[02]5, seed:=2 | ||
+ | seed^2 -> 4 | ||
+ | See it with 4 digits -> 0004 | ||
+ | Pick middle digits -> 0[00]4, seed:=0 | ||
+ | seed^2 -> 0 | ||
+ | See it with 4 digits -> 0000 | ||
+ | Pick middle digits -> 0[00]0, seed:=0 | ||
+ | Here the new seed is the same of the previous, so the process stops | ||
+ | |||
+ | let's skip some iterations to show another problem | ||
+ | a:=24 | ||
+ | seed:=a | ||
+ | seed^2 -> 576 | ||
+ | See it with 4 digits -> 0576 | ||
+ | Pick middle digits -> 0[57]6, seed:=57 | ||
+ | seed^2 -> 3249 | ||
+ | See it with 4 digits -> 3249 | ||
+ | Pick middle digits -> 3[24]9, seed:=24 | ||
+ | |||
+ | At this time you can do an infinite loop that is cyclic, | ||
+ | so a way is to limit any sequence to max 100^k iterations. | ||
+ | |||
+ | In this case one could consider the sentence "For a generator of n-digit numbers the period can be no longer than 8^n" on wikipedia, but to make it standard, let's limit it to 100^k iterations. | ||
+ | </ | ||
+ | |||
+ | **How to report a result** \\ | ||
+ | < | ||
+ | A result is composed by the following list | ||
+ | - the device used plus the language used, overclock if any, custom firmware if any and so on. | ||
+ | - time elapsed for a given k in seconds (see below) | ||
+ | - the code used. | ||
+ | |||
+ | if the calculator is too slow, or limited, to compute a given n, then report "for n=100^k the computation | ||
+ | takes too much time". Conversely, if the calculator is too fast to compute a given n, then report | ||
+ | "for n=100^k the computation takes too little time, I skipped it" | ||
+ | </ | ||
+ | |||
+ | ===== Physical calculators ===== | ||
+ | |||
+ | ==== Results for n:=100^k, k=1 ==== | ||
+ | - HP 50g , 2.15 with HPGCC3 patch, ARM ASM | ||
+ | * Source: [[http:// | ||
+ | * Time: 0.0093 s @75mhz | ||
+ | * Notes: For the code, I obviously needed division. This time, I searched for proper implementations of division on the web and found [[http:// | ||
+ | * Code: ((< | ||
+ | CODE | ||
+ | A=0.W | ||
+ | GOSBVL POP# | ||
+ | GOSBVL SAVPTR | ||
+ | SKUB { | ||
+ | *start | ||
+ | !ARM | ||
+ | STMDB sp! {R4 R5 R6 R7 R8 R9 R10 LP } | ||
+ | LDR R2, | ||
+ | MOV R3,1 | ||
+ | MOV R10,10 | ||
+ | *ALOGloop | ||
+ | MUL R3,R3,R10 | ||
+ | SUBS R2,R2,1 | ||
+ | BNE ALOGloop | ||
+ | MUL R2,R3,R3 | ||
+ | MOV R7,R2 | ||
+ | BL divmod | ||
+ | MOV R4,R9 | ||
+ | *FORloop | ||
+ | MOV R5,R4 | ||
+ | MOV R6,R4 | ||
+ | *WHILEloop | ||
+ | MUL R,R5,R5 | ||
+ | MOV R10,R3 | ||
+ | BL divmod | ||
+ | MOV R7,R9 | ||
+ | MOV R10,R2 | ||
+ | BL divmod | ||
+ | MOV R8,R5 | ||
+ | MOV R5,R7 | ||
+ | CMP R7,R8 | ||
+ | SUBNES R6,R6,1 | ||
+ | BNE WHILEloop | ||
+ | ADD R4,R4,1 | ||
+ | CMP R4,R2 | ||
+ | BNE FORloop | ||
+ | LDMIA sp! {R4 R5 R6 R7 R8 R9 R10 PC} | ||
+ | *divmod | ||
+ | MOV R8,R10 | ||
+ | MOV R9,0 | ||
+ | CMP R7,R10 | ||
+ | MOVLO PC,LR | ||
+ | *.LSloop | ||
+ | MOV R8,R8 LSL #1 | ||
+ | CMP R8,R7 | ||
+ | BLO .LSloop | ||
+ | *.RSloop | ||
+ | MOVHI R8,R8 LSR #1 | ||
+ | CMP R7,R8 | ||
+ | SUBHS R7,R7,R8 | ||
+ | ADC R9,R9,R9 | ||
+ | CMP R8,R10 | ||
+ | BHI .RSloop | ||
+ | MOV PC,LR | ||
+ | !ASM | ||
+ | *end | ||
+ | } | ||
+ | C=RSTK | ||
+ | D0=C | ||
+ | D1=80100 | ||
+ | LC(5)end-start | ||
+ | MOVEDN | ||
+ | LC 80100 | ||
+ | ARMSAT | ||
+ | GOVLNG GETPTRLOOP | ||
+ | ENDCODE | ||
+ | </ | ||
+ | - HP 50g , 2.15 with HPGCC3 patch, SaturnASM | ||
+ | * Source: [[http:// | ||
+ | * Time: 0.0359 s @75mhz | ||
+ | * Notes: just with Saturn ASM, though I have to admit that I used some instructions the real Saturn didn't have - only the emulated Saturn can multiply and divide with a single instruction. So what will ARM ASM bring? \\ Another note on optimization in this code: I let the limitCyclicSequences counter run backwards from n to 0, this saves an instruction or two in the WHILE loop compared to the pseudocode version. | ||
+ | * Code: ((< | ||
+ | CODE | ||
+ | GOSBVL POP# | ||
+ | GOSBVL SAVPTR | ||
+ | B=A.A | ||
+ | C=0.W | ||
+ | A=0.W | ||
+ | C+1.W | ||
+ | A+10.W | ||
+ | *ALOGloop | ||
+ | C*A.W | ||
+ | B-1.W | ||
+ | ?B#0.A | ||
+ | GOYES ALOGloop | ||
+ | R0=C.W | ||
+ | C*C.W | ||
+ | R1=C.W | ||
+ | C/A.W | ||
+ | R2=C.W | ||
+ | *FORloop | ||
+ | B=C.W | ||
+ | D=C.W | ||
+ | *WHILEloop | ||
+ | D-1.W | ||
+ | ?D=0.W | ||
+ | GOYES endFORloop | ||
+ | C*C.W | ||
+ | A=R0.W | ||
+ | C/A.WA=R1.W | ||
+ | C%A.W | ||
+ | A=B.W | ||
+ | B=C.W | ||
+ | ?A#C.W | ||
+ | GOYES WHILEloop | ||
+ | *endFORloop | ||
+ | C=R2.W | ||
+ | A=R1.W | ||
+ | C+1.W | ||
+ | R2=C.W | ||
+ | ?C#A.W | ||
+ | GOYES FORloop | ||
+ | GOVLNG GETPTRLOOP | ||
+ | ENDCODE | ||
+ | </ | ||
+ | - HP 50g , 2.15 with HPGCC3 patch, SysRPL | ||
+ | * Source: [[http:// | ||
+ | * Time: 1.1421 s @75mhz | ||
+ | * Notes: Due to the size of the numbers this test produces I can't use normal bints if I want to run it for k=2. As a consequence, | ||
+ | * Code: ((< | ||
+ | :: | ||
+ | BINT10 BINT100 | ||
+ | BINT100 BINT10 DO | ||
+ | DUPINDEX@ BEGIN | ||
+ | DUPDUP #* 5PICK #/ SWAPDROP 4PICK #/ DROP | ||
+ | SWAPOVER #= | ||
+ | ROT #1- DUP4UNROLL #0= | ||
+ | OR | ||
+ | UNTIL 2DROP | ||
+ | LOOP 2DROP | ||
+ | ; | ||
+ | </ | ||
+ | - HP 50g , 2.15 HRAST BASIC | ||
+ | * Source: [[http:// | ||
+ | * Time: 1.94s | ||
+ | * Code: ((< | ||
+ | WATCH INTEGER K=1, | ||
+ | FOR I=N/10 TO N-1 O=I FOR L=N R=MOD(^O/ | ||
+ | NEXT ? TICKS | ||
+ | </ | ||
+ | - HP 49g, HRAST BASIC | ||
+ | * Source: [[http:// | ||
+ | * Time: 4.81s | ||
+ | * Code: ((< | ||
+ | WATCH INTEGER K=1, | ||
+ | FOR I=N/10 TO N-1 O=I FOR L=N R=MOD(^O/ | ||
+ | NEXT ? TICKS | ||
+ | </ | ||
+ | - HP 50g , 2.15 with HPGCC3 patch, UserRPL | ||
+ | * Source: [[http:// | ||
+ | * Time: 10.6349 s @75mhz | ||
+ | * Code: ((< | ||
+ | << | ||
+ | ALOG DUP SQ | ||
+ | DUP 10. / OVER 1. - FOR a | ||
+ | DUP a DO | ||
+ | DUP SQ 5. PICK / IP 4. PICK MOD | ||
+ | UNTIL | ||
+ | SWAP OVER == | ||
+ | ROT 1. - UNROT PICK3 NOT | ||
+ | OR | ||
+ | END DROP2 | ||
+ | NEXT DROP2 | ||
+ | >> | ||
+ | </ | ||
+ | - HP 50g , 2.08, UserRPL 75mhz , #1 | ||
+ | * Time: 24.4003 sec | ||
+ | * Code: ((< | ||
+ | \<< | ||
+ | @k | ||
+ | 100 SWAP ^ @n:= 100^k | ||
+ | DUP \v/ @\v/n | ||
+ | 0 @old | ||
+ | 0 @new | ||
+ | 0 @limit | ||
+ | \-> | ||
+ | n | ||
+ | nsqr | ||
+ | old | ||
+ | new | ||
+ | limit | ||
+ | \<< | ||
+ | PUSH @saves system flags | ||
+ | -105 SF @approx mode | ||
+ | |||
+ | n 10 / | ||
+ | n 1 - | ||
+ | FOR a | ||
+ | a ' | ||
+ | 1 CF @the sequence is not convergent | ||
+ | 0 ' | ||
+ | WHILE | ||
+ | 1 FC? | ||
+ | limit n < | ||
+ | AND | ||
+ | REPEAT | ||
+ | @extracts middle digits | ||
+ | old old * @square | ||
+ | nsqr / IP @integer part of old^2/\v/n | ||
+ | n / FP @the previous result / n^2, picks the decimals | ||
+ | n * @restores the decimals as integers | ||
+ | ' | ||
+ | IF | ||
+ | new old == | ||
+ | THEN | ||
+ | 1 SF | ||
+ | ELSE | ||
+ | new ' | ||
+ | ' | ||
+ | END | ||
+ | END | ||
+ | NEXT | ||
+ | |||
+ | POP @restores system flags | ||
+ | \>> | ||
+ | \>> | ||
+ | |||
+ | executed with TEVAL | ||
+ | </ | ||
+ | |||
+ | ==== Results for n:=100^k, k=2 ==== | ||
+ | - HP 50g , 2.15 with HPGCC3 patch, ARM ASM | ||
+ | * Source: [[http:// | ||
+ | * Time: 169.9734 s @75mhz | ||
+ | * refer to the same entry for k:=1 | ||
+ | - HP 50g , 2.15 with HPGCC3 patch, SaturnASM | ||
+ | * Source: [[http:// | ||
+ | * Time: 1445.8055 s @75mhz | ||
+ | * Code: see the entry for k=1. | ||
+ | - HP 50g , 2.08, UserRPL 75mhz , #1 | ||
+ | * Time: estimated 250' | ||
+ | \\ | ||
+ | Today i interrupted the computation after almost 48 hours and it is still unfinished as expected. \\ | ||
+ | **To avoid battery consumption you must use the an usb charger!**)) | ||
+ | * Code: The same of the entry "HP 50g , 2.08, UserRPL 75mhz , #1" for the 100^k, k=1 input. | ||
+ | |||
+ | ==== Results for n:=100^k, k=3 ==== | ||
+ | |||
+ | ==== Results for n:=100^k, k=4 ==== | ||
+ | |||
+ | ===== Emulators on mobile/ | ||
+ | Why? See the note ((Because, a part from school, one can use an handheld device to do math, maybe simple math using built-in functions of the framework, as well. Sure, it is not comfortable as a physical calculator unless it is large enough to simulate a calculator keyboard.)) . \\ | ||
+ | **>> | ||
+ | |||
+ | ==== Results for n:=100^k, k=1 ==== | ||
+ | - Palm treo pro GSM (400mhz, 128Mb ram), [[http:// | ||
+ | * Time: around 3 seconds (doing also other things, it is multitasking) | ||
+ | * <code python> | ||
+ | def middleSquareTest(k): | ||
+ | import time; | ||
+ | print time.localtime(); | ||
+ | n = 100**k; | ||
+ | nsqrt = n**(1/2.0); | ||
+ | temp = None; | ||
+ | new = None; | ||
+ | for a in range(n/10, n): | ||
+ | old = a; | ||
+ | for j in range(n): | ||
+ | #limit to max ' | ||
+ | temp = int(old*old / nsqrt)/ | ||
+ | new = (temp-int(temp))*n; | ||
+ | if new == old: | ||
+ | #exit | ||
+ | j = n; | ||
+ | else: | ||
+ | #print old; | ||
+ | old = new; | ||
+ | | ||
+ | print time.localtime(); | ||
+ | </ | ||
+ | |||
+ | ==== Results for n:=100^k, k=2 ==== | ||
+ | - Palm treo pro GSM (400mhz, 128Mb ram), [[http:// | ||
+ | * Time: 27'706 seconds (doing also other things, it is multitasking) | ||
+ | * See entry for k=1 | ||
+ | |||
+ | ==== Results for n:=100^k, k=3 ==== | ||
+ | |||
+ | ==== Results for n:=100^k, k=4 ==== | ||
+ | |||
+ | ===== Target devices of this test ===== | ||
+ | As Marcel Samek on 11 Sept 2013 wrote on MoHPC' |