# HP Calculator Wiki

### Site Tools

benchmarks:middlesquare

# Differences

This shows you the differences between two versions of the page.

 benchmarks:middlesquare [2017/05/07 01:20]pier4r benchmarks:middlesquare [2017/05/09 09:03] (current)pier4r Both sides previous revision Previous revision 2017/05/09 09:03 pier4r 2017/05/07 01:20 pier4r 2017/05/02 13:04 pier4r 2017/05/02 13:01 pier4r 2017/05/02 09:55 pier4r 2017/05/02 09:43 pier4r 2017/05/02 09:01 pier4r 2017/05/02 09:00 pier4r 2017/05/02 09:00 pier4r 2017/05/02 08:46 pier4r 2017/05/02 08:45 pier4r 2013/09/22 03:13 pier4r 2013/09/22 02:36 pier4r 2013/09/19 10:21 pier4r 2013/09/18 04:02 pier4r 2013/09/15 14:01 pier4r 2013/09/12 10:07 pier4r 2013/09/12 06:51 pier4r 2013/09/12 05:08 pier4r 2013/09/12 02:09 pier4r 2013/09/12 02:08 pier4r 2013/09/12 00:27 pier4r 2013/09/12 00:23 pier4r 2013/09/11 17:16 pier4r 2013/09/11 17:07 pier4r 2013/09/11 05:24 pier4r 2013/09/11 01:55 pier4r 2017/05/09 09:03 pier4r 2017/05/07 01:20 pier4r 2017/05/02 13:04 pier4r 2017/05/02 13:01 pier4r 2017/05/02 09:55 pier4r 2017/05/02 09:43 pier4r 2017/05/02 09:01 pier4r 2017/05/02 09:00 pier4r 2017/05/02 09:00 pier4r 2017/05/02 08:46 pier4r 2017/05/02 08:45 pier4r 2013/09/22 03:13 pier4r 2013/09/22 02:36 pier4r 2013/09/19 10:21 pier4r 2013/09/18 04:02 pier4r 2013/09/15 14:01 pier4r 2013/09/12 10:07 pier4r 2013/09/12 06:51 pier4r 2013/09/12 05:08 pier4r 2013/09/12 02:09 pier4r 2013/09/12 02:08 pier4r 2013/09/12 00:27 pier4r 2013/09/12 00:23 pier4r 2013/09/11 17:16 pier4r 2013/09/11 17:07 pier4r 2013/09/11 05:24 pier4r 2013/09/11 01:55 pier4r 2013/09/11 01:53 pier4r 2013/09/10 16:20 pier4r 2013/09/10 16:17 pier4r 2013/09/10 16:15 pier4r 2013/09/10 16:14 pier4r 2013/09/10 16:13 pier4r 2013/09/10 16:08 pier4r 2013/09/10 15:52 pier4r 2013/09/10 15:45 pier4r 2013/09/10 15:20 pier4r 2013/09/10 15:19 pier4r 2013/09/10 15:15 pier4r 2013/09/10 15:07 pier4r 2013/09/10 14:30 pier4r 2013/09/10 14:26 pier4r 2013/09/10 14:21 pier4r 2013/09/10 14:11 pier4r 2013/09/10 14:04 pier4r 2013/09/10 13:16 pier4r created Line 1: Line 1: + ​~~TALKPAGE~~​ + ====== Calculator benchmark: middle square method seed test ====== + + **The idea** + > In September 2013 I started to contribute on wiki4hp, and my first self-chosen task was: mine information from discussion places of HP calculators'​s community to report links to best/useful discussions and organize these by topics. One of those topic is: a general calculator benchmark for programmable calculators. As September 2013, I think that a lot of calculator benchmarks were discussed by the community, but only two has earned enough popularity: the //​[[benchmarks:​addloop|calculator add loop]]// and the //​[[benchmarks:​nqueens|nqueens (where n is 8) problem]]//​. ​ + > Benchmarks are not the ultimate test to measure the absolute computational power of a calculator (or a computer), but they give a good approximation of relative speed between calculators;​ that is: if a calculator A is faster than a calculator B in a benchmark, for computations that involves similar operations as in the benchmark, we can expect that the calculator A will take less time to finish them. + > + > Now, if we look at code of the two popular benchmark, we can see that the //​Calculator add loop// is really simple: just add one to a previous number, starting from 0, in the smartest way that you can code. The second is more complex but it involves: memory, subtractions and additions, and the "​n"​ is not variable, is fixed to 8. This is a great limitation for new calculators that are really fast. Unfortunately,​ with a greater n maybe some old calculators can't do anything (since they don't have enough memory), or maybe the community likes the number 8 since the problem is related to chess (that has a chessboard of 8x8). Anyway the benchmark can be extended (increasing the n value by fixed steps), but this can be tried later, hoping that the community will contribute (again). + > + > So thinking about a benchmark that use at least the four (+,-,*,/) operations, I stumbled upon the middle square method to produce, easily, random numbers((Even if the method has a lot of flaws from the point of view of randomness.)). Since we don't care about a particular benchmark algorithm, but we use the benchmark only to compare devices, it should be ok to test almost all programmable calculators,​ since it doesn'​t require much memory and it involves the four operations. + > + >So, if you agree (else, feel free to fork the idea on this wiki!) let's look the pseudocode. + + See also [[http://​www.hpmuseum.org/​forum/​thread-8287-post-72772.html#​pid72772|here(MoHPC)]] + + **Criticism**\\ + [[http://​www.hpmuseum.org/​cgi-sys/​cgiwrap/​hpmuseum/​archv021.cgi?​read=249996|MoHPC]] . Message #6. In short, a benchmark should be helpful to compare calculators performance on the same set of the instructions. While the middle square benchmark is more a programming challenge, because some calculators have more instructions than others or have other data types that allows faster/​slower/​more precise/​etc.. operations. + + So actually this should be seen as challenge, and only after that as an approximative benchmark. + + **The Pseudocode** \\ + As the two popular benchmarks shows, we don't test only the computational power of the devices, ​ + but also the programming ability of the user to "​trick"​ the game. + <​code>​ + k:= a positive integer + n:= 100^k + For i:=(n/10) to (n-1) do { + old := i + sequenceIsConvergent := false + limitCyclicSequences := 0 //because we can end in cyclic sequences and we don't want to use a lot of memory! + while ( (NOT sequenceIsConvergent) AND (limitCyclicSequences < n) ) { + new = extract_middle_number from old^2 (see below) + if new == old { + sequenceIsConvergent = true + } + else { + old := new + limitCyclicSequences = limitCyclicSequences+1;​ + }    ​ + }//end while + }//end for + + ->​Result:​ the time spent to finish the computation given the starting k value. + + How does extract_middle_digits work? + Given n, that is a power of 10 of the type 10^d, + its square is equal to 10^(d*2) and it has (d*2+1) digits. + + We consider only the last (d*2) digits. ​ + That is: if n=10000, 10000*10000 = 100000000 we consider only 00000000 without the first 1. + + Then we '​consider'​ all the numbers lower than 10^(d*2) with (d*2) digits. ​ + For example if d=4 and the number under process is 1542, + we consider 1542^2 as 02377764 instead of 2377764. + + Why d=4 ? Because we use as seed, with n=10000, numbers from 1000 to 9999, so numbers having 4 digits. + + After that we pick the d=4 middle digits, from 02377764 we pick 02[3777]64. + So the middle number extracted is 3777. + ​ + + The method of extraction for the digits can be anything you want, a standard method involving modulo and divisions or some other method. You can use an iterative subtraction to get the modulo instead of using the calculator'​s built-in function to compute it. Here we will see the user ability to exploit all the possibilities of the calculator, hoping that even algorithms with "​classic"​ operations will be reported for a comparison. + + **More details!** + <​code>​ + Let k=1 , so we have as reference n:= 100^k = 100 + Do sequences for all the seeds from a:=n/10 to n-1. + So the seeds are {a:​=10,​11,​12,​...,​99} + + How do we compute a sequence? + a:=10 + seed:=a + 1. we first square the number + 10*10 = 100 + 2. 100 squared is 10000, with 4 zeroes, then we consider the previous number with four digits (with a leading zero) + 0100 + 3. Then we pick the middle digits of the number, as many as the zeroes of n (that is 100), the new seed is + 0[10]0 + 4. The new seed is the same as the initial seed, then if we reapply the process we end up always with 10. Therefore the sequence is convergent. + + Next step + a:=11 + seed:=a + seed^2 -> 121 + See it with 4 digits -> 0121 + Pick middle digits -> 0[12]1 , seed:= 12 + seed^2 -> 144 + See it with 4 digits -> 0144 + Pick middle digits -> 0[14]4, seed:=14 + seed^2 -> 196 + See it with 4 digits -> 0196 + Pick middle digits -> 0[19]6, seed:=19 + seed^2 -> 361 + See it with 4 digits -> 0361 + Pick middle digits -> 0[36]1, seed:=36 + seed^2 -> 1296 + See it with 4 digits -> 1296 + Pick middle digits -> 1[29]6, seed:=29 + seed^2 -> 841 + See it with 4 digits -> 0841 + Pick middle digits -> 0[84]1, seed:=84 + seed^2 -> 7056 + See it with 4 digits -> 7056 + Pick middle digits -> 7[05]6, seed:=5 + seed^2 -> 25 + See it with 4 digits -> 0025 + Pick middle digits -> 0[02]5, seed:=2 + seed^2 -> 4 + See it with 4 digits -> 0004 + Pick middle digits -> 0[00]4, seed:=0 + seed^2 -> 0 + See it with 4 digits -> 0000 + Pick middle digits -> 0[00]0, seed:=0 + Here the new seed is the same of the previous, so the process stops + + let's skip some iterations to show another problem + a:=24 + seed:=a + seed^2 -> 576 + See it with 4 digits -> 0576 + Pick middle digits -> 0[57]6, seed:=57 + seed^2 -> 3249 + See it with 4 digits -> 3249 + Pick middle digits -> 3[24]9, seed:=24 + + At this time you can do an infinite loop that is cyclic, + so a way is to limit any sequence to max 100^k iterations. + + In this case one could consider the sentence "For a generator of n-digit numbers the period can be no longer than 8^n" on wikipedia, but to make it standard, let's limit it to 100^k iterations. + ​ + + **How to report a result** \\ + <​code>​ + A result is composed by the following list + - the device used plus the language used, overclock if any, custom firmware if any and so on. + - time elapsed for a given k in seconds (see below) + - the code used. + + if the calculator is too slow, or limited, to compute a given n, then report "for n=100^k the computation + takes too much time". Conversely, if the calculator is too fast to compute a given n, then report ​ + "for n=100^k the computation takes too little time, I skipped it" + ​ + + ===== Physical calculators ===== + + ==== Results for n:=100^k, k=1 ==== + - HP 50g , 2.15 with HPGCC3 patch, ARM ASM + * Source: [[http://​community.casiocalc.org/​topic/​7183-help-needed-calculator-benchmark/​page__st__40#​entry58809|3298 on Casio forum]] + * Time: 0.0093 s @75mhz + * Notes: For the code, I obviously needed division. This time, I searched for proper implementations of division on the web and found [[http://​infocenter.arm.com/​help/​index.jsp?​topic=/​com.arm.doc.dui0040d/​ch05s08s02.html|this]] one in the official infocenter. It is pretty scary how close I got with my own modulo routine, so it was quite optimized after all. Not many processors allow the programmer to write such a compact version (''​CMP R7,R8; SUBNES R6,R6,1; BNE WHILEloop''​) of ''​if(R7!=R8 && (--R6)!=0) goto WHILEloop;''​ + * Code: ((<​code>​ + CODE + A=0.W + GOSBVL POP# + GOSBVL SAVPTR + SKUB { + *start + !ARM + STMDB sp! {R4 R5 R6 R7 R8 R9 R10 LP } + LDR R2,​[R1,#​2316] + MOV R3,1 + MOV R10,10 + *ALOGloop + MUL R3,R3,R10 + SUBS R2,R2,1 + BNE ALOGloop + MUL R2,R3,R3 + MOV R7,R2 + BL divmod + MOV R4,R9 + *FORloop + MOV R5,R4 + MOV R6,R4 + *WHILEloop + MUL R,R5,R5 + MOV R10,R3 + BL divmod + MOV R7,R9 + MOV R10,R2 + BL divmod + MOV R8,R5 + MOV R5,R7 + CMP R7,R8 + SUBNES R6,R6,1 + BNE WHILEloop + ADD R4,R4,1 + CMP R4,R2 + BNE FORloop + LDMIA sp! {R4 R5 R6 R7 R8 R9 R10 PC} + *divmod + MOV R8,R10 + MOV R9,0 + CMP R7,R10 + MOVLO PC,LR + *.LSloop + MOV R8,R8 LSL #1 + CMP R8,R7 + BLO .LSloop + *.RSloop + MOVHI R8,R8 LSR #1 + CMP R7,R8 + SUBHS R7,R7,R8 + ADC R9,R9,R9 + CMP R8,R10 + BHI .RSloop + MOV PC,LR + !ASM + *end + } + C=RSTK + D0=C + D1=80100 + LC(5)end-start + MOVEDN + LC 80100 + ARMSAT + GOVLNG GETPTRLOOP + ENDCODE + ​)) + - HP 50g , 2.15 with HPGCC3 patch, SaturnASM + * Source: [[http://​community.casiocalc.org/​topic/​7183-help-needed-calculator-benchmark/​page__st__40#​entry58800|3298 on Casio forum]] + * Time: 0.0359 s @75mhz + * Notes: just with Saturn ASM, though I have to admit that I used some instructions the real Saturn didn't have - only the emulated Saturn can multiply and divide with a single instruction. So what will ARM ASM bring? \\ Another note on optimization in this code: I let the limitCyclicSequences counter run backwards from n to 0, this saves an instruction or two in the WHILE loop compared to the pseudocode version. + * Code: ((<​code>​ + CODE + GOSBVL POP# + GOSBVL SAVPTR + B=A.A + C=0.W + A=0.W + C+1.W + A+10.W + *ALOGloop + C*A.W + B-1.W + ?B#0.A + GOYES ALOGloop + R0=C.W + C*C.W + R1=C.W + C/A.W + R2=C.W + *FORloop + B=C.W + D=C.W + *WHILEloop + D-1.W + ?D=0.W + GOYES endFORloop + C*C.W + A=R0.W + C/A.WA=R1.W + C%A.W + A=B.W + B=C.W + ?A#C.W + GOYES WHILEloop + *endFORloop + C=R2.W + A=R1.W + C+1.W + R2=C.W + ?C#A.W + GOYES FORloop + GOVLNG GETPTRLOOP + ENDCODE + ​)) + - HP 50g , 2.15 with HPGCC3 patch, SysRPL + * Source: [[http://​community.casiocalc.org/​topic/​7183-help-needed-calculator-benchmark/​page__st__40#​entry58798|3298 on Casio forum]] + * Time: 1.1421 s @75mhz + * Notes: Due to the size of the numbers this test produces I can't use normal bints if I want to run it for k=2. As a consequence,​ my first try was to translate my UserRPL version to SysRPL using HXS numbers. That one got k=1 solved in about 4 seconds with hardcoded constants n and n² for k=1. When I replaced them by a piece of code that was supposed to calculate them from user input, it was for some reason slowed down to about 12 seconds (yes, slower than my previous UserRPL version). I could have done a version that is slightly faster than the UserRPL version by using real numbers as well, but as the highest number calculated for k=1 is still below the limit of 1048576, I can safely use bints for that one. + * Code: ((<​code>​ + :: + BINT10 BINT100 + BINT100 BINT10 DO + DUPINDEX@ BEGIN + DUPDUP #* 5PICK #/ SWAPDROP 4PICK #/ DROP + SWAPOVER #= + ROT #1- DUP4UNROLL #0= + OR + UNTIL 2DROP + LOOP 2DROP + ; + ​)) + - HP 50g , 2.15 HRAST BASIC + * Source: [[http://​www.hpmuseum.org/​forum/​thread-8287-post-72903.html#​pid72903|On MoHPC forum]] + * Time: 1.94s + * Code: ((<​code>​ + WATCH INTEGER K=1,​N=100^K,​Q=SQR N + FOR I=N/10 TO N-1 O=I FOR L=N R=MOD(^O/​Q,​N) IF O<>R@ O=R NEXT ELSE LEAVE + NEXT ? TICKS + ​)) + - HP 49g, HRAST BASIC + * Source: [[http://​www.hpmuseum.org/​forum/​thread-8287-post-72903.html#​pid72903|On MoHPC forum]] + * Time: 4.81s + * Code: ((<​code>​ + WATCH INTEGER K=1,​N=100^K,​Q=SQR N + FOR I=N/10 TO N-1 O=I FOR L=N R=MOD(^O/​Q,​N) IF O=R LEAVE ELSE O=R NEXT + NEXT ? TICKS + ​)) + - HP 50g , 2.15 with HPGCC3 patch, UserRPL + * Source: [[http://​community.casiocalc.org/​topic/​7183-help-needed-calculator-benchmark/​page__st__40#​entry58798|3298 on Casio forum]] + * Time: 10.6349 s @75mhz + * Code: ((<​code>​ + << + ALOG DUP SQ + DUP 10. / OVER 1. - FOR a + DUP a DO + DUP SQ 5. PICK / IP 4. PICK MOD + UNTIL + SWAP OVER == + ROT 1. - UNROT PICK3 NOT + OR + END DROP2 + NEXT DROP2 + >> + ​)) + - HP 50g , 2.08, UserRPL 75mhz , #1 + * Time: 24.4003 sec + * Code: ((<​code>​ + \<< + @k + 100 SWAP ^ @n:= 100^k + DUP \v/ @\v/n + 0 @old + 0 @new + 0 @limit + \-> + n + nsqr + old + new + limit + \<< + PUSH @saves system flags + -105 SF @approx mode + + n 10 / + n 1 - + FOR a + a '​old'​ STO + 1 CF @the sequence is not convergent + 0 '​limit'​ STO @to limit cyclic sequences + WHILE + 1 FC? + limit n < + AND + REPEAT + @extracts middle digits + old old * @square + nsqr / IP @integer part of old^2/\v/n + n / FP @the previous result / n^2, picks the decimals + n * @restores the decimals as integers + '​new'​ STO + IF + new old == + THEN + 1 SF + ELSE + new '​old'​ STO + '​limit'​ 1 STO+ + END + END + NEXT + + POP @restores system flags + \>> + \>> + + executed with TEVAL + ​)) + + ==== Results for n:=100^k, k=2 ==== + - HP 50g , 2.15 with HPGCC3 patch, ARM ASM + * Source: [[http://​community.casiocalc.org/​topic/​7183-help-needed-calculator-benchmark/​page__st__40#​entry58809|3298 on Casio forum]] + * Time: 169.9734 s @75mhz + * refer to the same entry for k:=1 + - HP 50g , 2.15 with HPGCC3 patch, SaturnASM + * Source: [[http://​community.casiocalc.org/​topic/​7183-help-needed-calculator-benchmark/​page__st__40#​entry58800|3298 on Casio forum]] + * Time: 1445.8055 s @75mhz + * Code: see the entry for k=1. + - HP 50g , 2.08, UserRPL 75mhz , #1 + * Time: estimated 250'​000 secs ((Because the complexity of the task for the input k=1 is around $O(100 \cdot 100=100^2)$ and it takes 24.4 secs. The complexity of the task with the input k=2 is around $O(100^2 \cdot 100^2 = 100^4)$ that is $100^2$ times the complexity with k=1. So, while the Hp50g is still crunching for a confirmation,​ the task will take around 250'​000 seconds. \\ + \\ + Today i interrupted the computation after almost 48 hours and it is still unfinished as expected. \\ + **To avoid battery consumption you must use the an usb charger!**)) + * Code: The same of the entry "HP 50g , 2.08, UserRPL 75mhz , #1" for the 100^k, k=1 input. + + ==== Results for n:=100^k, k=3 ==== + + ==== Results for n:=100^k, k=4 ==== + + ===== Emulators on mobile/​handheld device OR mobile/​handheld devices itself ===== + Why? See the note ((Because, a part from school, one can use an handheld device to do math, maybe simple math using built-in functions of the framework, as well. Sure, it is not comfortable as a physical calculator unless it is large enough to simulate a calculator keyboard.)) . \\ + **>>​Devices smaller than a tablet of 7 inches, 7 inches included<<​** + + ==== Results for n:=100^k, k=1 ==== + - Palm treo pro GSM (400mhz, 128Mb ram), [[http://​pythonce.sourceforge.net/​|pythonCE 2.5-20061219]] #1 + * Time: around 3 seconds (doing also other things, it is multitasking) + * + def middleSquareTest(k):​ + import time; + print time.localtime();​ + n = 100**k; + nsqrt = n**(1/2.0); + temp = None; + new = None; + for a in range(n/10, n): + old = a; + for j in range(n): + #limit to max '​n'​ iterations because we don't want unlimited cycles + temp = int(old*old / nsqrt)/​float(n) ; + new = (temp-int(temp))*n;​ + if new == old: + #exit + j = n; + else: + #print old; + old = new; + ​ + print time.localtime();​ + ​ + + ==== Results for n:=100^k, k=2 ==== + - Palm treo pro GSM (400mhz, 128Mb ram), [[http://​pythonce.sourceforge.net/​|pythonCE 2.5-20061219]] #1 + * Time: 27'706 seconds (doing also other things, it is multitasking) + * See entry for k=1 + + ==== Results for n:=100^k, k=3 ==== + + ==== Results for n:=100^k, k=4 ==== + + ===== Target devices of this test ===== + As Marcel Samek on 11 Sept 2013 wrote on MoHPC'​s forum, this test is hard for small programmable calculators that can't play with integer of fractional part of a number. So the taget devices of this test are high end calculators with good mathematical framework.