From mk Sat Jun 30 13:06:33 2001
From: Martin Kersten <Martin.Kersten@cwi.nl>
Subject: Re: performance
To: Martin.Kersten@cwi.nl (Martin Kersten)
Date: Sat, 30 Jun 2001 13:06:31 +0200 (CEST)
In-Reply-To: <200106292337.f5TNbhl29637@echolood.cwi.nl> from "Martin Kersten" at Jun 30, 2001 01:37:43 AM


Forwarded message:
> 
> here the results of the little experiment of this afternoon,
> after trying to answer the question "why does it take 0.8 sec
> to start Mserver on Stefan's machine compared to 2.7 sec on
> Martin's"
> 
> On Stefan's machine (dual 550 MHz)
> 	V4.3 -O6	V4.3 -O2	V5.0 -O2
> base	0.8/0.2/0.6	0.8/0.2/0.5	0.8/0.2/0.5
> tst400a	7.7/6.5/0.6	7.0/6.4/0.5	1.8/1.2/0.5
> tst400b	3.3/2.5/0.8	3.0/2.4/0.6	1.6/0.7/0.8
> tst400c	14.2/13.3/0.9	14.5/13.9/0.5	1.9/0.7/0.9
> tst400d	9.2/8.6/0.5	10.1/9.3/0.7	0.8/0.1/0.6
> 
> On Martin's machine (450 MHz)
> 			V4.3 -O2	V5.0 -O2
> base			2.7/0.3/2.4	2.6/0.25/2.3
> tst400a		8.7/6.3/2.4	3.5/1.2/2.3
> tst400b 		5.0/2.5/2.5	3.2/0.7/2.5
> tst400c		5.6/3.0/2.6	3.2/0.6/2.6
> tst400d		3.0/0.5/2.5	2.6/0.2/2.4
> 
> Observations:
> - original question not answered
> - tst c+d involve b.insert(1,1), which uses a simplified
>   reference counting scheme for bats (no BBPfix() in module interface)
> - synchronization seems really expensive on the dual
> - handling of the empty 1M loop in M4 doesn't align with the tst400d loop
> - new parser is quite effective
> 

The situation on June 30, 2001, calling from home

On Martin's machine (450 MHz)
		V5.0 -O2
base		2.6/0.2/2.4		
tst400b 	2.9/0.5/2.4		10k count
tst400c		3.0/0.5/2.5		10k insert
tst400d		2.6/0.2/2.4		100K insert tuple
tst400e		7.0/3.6/2.4		1M mal call increment function

Times at 25/8/01
		echolood(450MHz load 1.5)	Version 4.3
tst400a		 3.3/0.9/2.4	1M loop	 	8.7/ 6.2/2.4
tst400bHuge 	13  /7.3/5.8	100k count (io)	24.8/11.9/4.1
tst400cHuge	12.3/6.5/5.6	100k insert(io)	22.0/16.0/3.6
tst400d		 4.9/2.3/2.6	1M insert tuple 31.8/26.5/4.8
tst400e		 5.2/2.8/2.4	1M incr fcn    	16.2/12.2/2.4

		gaffel (2x400MHz Celeron load 0.0)	Version 4.3
tst400a		 1.5/0.9/0.6	1M loop	 	7.1/ 6.6/0.5
tst400bHuge 	13  /8.2/4.6	100k count (io)	14.6/13.0/1.5
tst400cHuge	12  /8.2/4.8	100k insert(io)	17.6/15.5/2.1
tst400d		 3.4/2.5/0.9	1M insert tuple 33.0/29.4/3.6
tst400e		 3.7/3.0/0.6	1M incr fcn    	13.1/12.6/0.5

		bezaan ( Athlon load 1.5)	Version 4.3
base		11.8/0.4/5.6			0m13/0.9/5.7
tst400a		22.5/5.1/6.0	1M loop	 	1m21/38/5.6
tst400bHuge 	1m28/32/15	100k count (io)	2m02/56/8.1
tst400cHuge	1m30/31/15	100k insert(io)	2m12/1m01/9.4
tst400d		0m38/13/7	1M insert tuple 4m02/1m53/13.9
tst400e		0m36/16/6	1M incr fcn    	2m20/1m06/5.7

		thor (XEON load 0.0)		Version 4.3
base		0.5/0.2/0.3			0.9/0.3/0.4
tst400a		1.1/0.7/0.4	1M loop	 	5.1/4.7/0.4
tst400bHuge 	6.4/4.3/1.9	100k count (io)	10.5/9.6/0.9
tst400cHuge	6.4/4.1/2.0	100k insert(io)	12.8/11.4/1.4
tst400d		2.4/1.7/0.6	1M insert tuple 25.0/22.4/2.5
tst400e		2.7/2.2/0.4	1M incr fcn    	9.6/9.2/0.4

Times as of 17-09-01 
		echolood(450MHz load 0)	Version 4.3
base		 2.5/0.2/2.3
tst400a		 3.0/0.7/2.3	1M loop	 	8.7/ 6.2/2.4
tst400a2	 3.4/1.0/2.3	1M loop inc
tst400bHuge 	 4.8/2.4/2.4	100k count (io)	24.8/11.9/4.1
tst400cHuge	 5.4/2.9/2.5	100k insert(io)	22.0/16.0/3.6
tst400d		 4.5/1.9/2.5	1M insert tuple 31.8/26.5/4.8
tst400e		 4.9/2.6/2.3	1M incr fcn    	16.2/12.2/2.4

Multiplex operations
We assume a bat[lng,lng] with 1M elements in increasing order.
tst901a contains a fast implementation, assuming aligned bats,
in both V5 and V4.3.

tst901b uses the generic MAL/MIL loop to achieve the same result.
For V5 a C version was also constructed, using a code generation
technique that translates each individual MAL statement into the
corresponding C statements. No inter-statement analysis was
performed, so it represents what a 'naive' expansion to a
C function would achieve.
The running time for this function was 3.6 seconds.

The compiled version, as part of batcalc, runs in 3.6 sec.

tst901a		9.4/5.6/3.6	[+] call	35.7/30.4/5.0
	make	5.7				32.4
	multiplex 0.31				 0.35

tst901b	20.6/16.3/4.3		explicit loop	82.8/68.5/7.3
	make	   5.5				41.5
	multiplex 14.6				59.6


Multiplex library calls,
Assume bats with 1M unique elements, all results in milliseconds
              Monet V5.0                      Monet V4.3
tst901d         [:+=]   [dbl]   [log]   [==]    [:+=]   [dbl]   [log]   [==]
[void,lng]      110     240     630             144     960     720     220
                110     258     650     234     130     967     727     224
                108     253     657     240     130     965     726     225
[lng,lng]       140     280     710             143     1020    880     345
                142     300     696     381     164     1136    974     402

The operation [+](lng,lng) costs 180 and 346 milliseconds, respectively.
The BATcopy underlying this operation takes 141 and 269 ms.
Although [+] == batcopy followed by [:+=], doing it in two steps is not
cost effective (180 ms versus 141+140 = 281 ms in V5).

C- implementation of a trivial iterator, compile time 0.2 sec
[      1, 639855.00]		profiler.setAll();
[      1, 99.00]		b := mal.new(lng,lng);
[      1, 26.00]		t0 := system.usec();
[      1,  3.00]		i := -1;
[      1,  3.00]		t0 := system.usec();
[1000001,  2.56]	barrier	v := mal.nextElement(i,0,1000000);
[1000000,  6.97]		bat.insert(b,i,i);
[1000000,  2.13]		redo v ;
[      1,  5.00]		t1 := system.usec();
[      1, 118475.00]		c := algebra.copy(b);
[      1,  7.00]		t2 := system.usec();
[      1, 111.00]		d := mal.new(lng,lng);
[      1,  5.00]		h := 0;
[      1,  3.00]		t := 0;
[1000001,  4.55]	barrier	mloop := mal.bunStream(b,h,t);
[1000000,  5.55]		B2 := algebra.find(b,h);
[1000000,  4.83]		B3 := algebra.find(c,h);
[1000000,  2.38]		cr := calc.+(B2,B3);
[1000000,  7.85]		bat.insert(d,h,cr);
[1000000,  2.24]	catch	GDKerror;
[1000000,  2.30]		redo mloop ;
[      1,  5.00]		t3 := system.usec();
[      1, 3714520.00]		batcalc.CMD000(b,b);
[      1,  8.00]		t4 := system.usec();
[      1,  5.00]		d1 := calc.-(t1,t0);
[      1,  3.00]		d2 := calc.-(t3,t2);
[      1,  3.00]		d3 := calc.-(t4,t3);
[      1, 39.00]		cnt := aggr.count(d);
[      1, 123.00]		system.printf("cnt %d ",cnt);
[      1, 19.00]		system.printf("make %d ",d1);
[      1, 79.00]		system.printf("multiplex %d\n",d2);
[      1, 87.00]		system.printf("compiled multiplex %d\n",d3);
[      1,  2.00]	end function;
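
The trace above can be turned into per-statement totals. The sketch below is only an illustration: it assumes the first column is the call count and the second the average time per call in microseconds (this fits the single batcalc.CMD000 call at about 3.7 s), and it copies a subset of the trace lines by hand. The helper names parse_trace and total_seconds are made up for this note.

```python
import re

# Lines copied from the trace above: [call count, assumed average usec per call].
TRACE = """\
[1000001,  2.56] barrier v := mal.nextElement(i,0,1000000);
[1000000,  6.97] bat.insert(b,i,i);
[1000000,  2.13] redo v ;
[      1, 118475.00] c := algebra.copy(b);
[1000001,  4.55] barrier mloop := mal.bunStream(b,h,t);
[1000000,  5.55] B2 := algebra.find(b,h);
[1000000,  4.83] B3 := algebra.find(c,h);
[1000000,  2.38] cr := calc.+(B2,B3);
[1000000,  7.85] bat.insert(d,h,cr);
[1000000,  2.24] catch GDKerror;
[1000000,  2.30] redo mloop ;
[      1, 3714520.00] batcalc.CMD000(b,b);
"""

LINE = re.compile(r"\[\s*(\d+),\s*([\d.]+)\]\s*(.*)")

def parse_trace(text):
    """Return (statement, calls, usec_per_call, total_seconds) per trace line."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            calls, usec = int(m.group(1)), float(m.group(2))
            rows.append((m.group(3).strip(), calls, usec, calls * usec / 1e6))
    return rows

def total_seconds(rows, start, stop):
    """Sum the total time of rows[start:stop]."""
    return sum(r[3] for r in rows[start:stop])

rows = parse_trace(TRACE)
make_s = total_seconds(rows, 0, 3)      # first loop: fill b
loop_s = total_seconds(rows, 4, 11)     # interpreted multiplex loop
compiled_s = rows[11][3]                # compiled batcalc.CMD000 call

for stmt, calls, usec, secs in rows:
    print(f"{secs:8.2f} s  {calls:>8} x {usec:>12.2f} us  {stmt}")
print(f"make {make_s:.2f} s, loop {loop_s:.2f} s, compiled {compiled_s:.2f} s")
```

Under that reading, the interpreted loop body accounts for far more time than the single compiled call, which is the comparison the tst901b numbers are after.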

--------------- MARCH 2002 ---------------------------
Just before we move to the Athlon desktops.
Calling convention: with -monetrc .../monet.conf
Both servers are measured 'hot' on a quiet machine.

Times as of 22-3-02
		echolood(450MHz load 0)	Version 4.3
base		 0.7/0.3/0.4			0.7/ 0.3/0.4
tst400a		 1.3/0.9/0.4	1M loop	 	8.2/ 7.9/0.4
tst400a2	 1.3/1.3/0.4	1M loop inc
tst400bHuge 	 3.7/3.0/0.5	100k count (io)	17.6/15.8/1.3
tst400cHuge	 3.8/3.2/0.4	100k insert(io)	22.0/18.0/1.6
tst400d		 6.5/5.1/1.4	1M insert tuple 38.2/34.7/2.8
tst400e		 3.8/3.4/0.4	1M incr fcn    	15.6/15.2/0.4

tst901a		8.3/6.7/1.5	[+] call	39.4/36.3/2.9
tst901b		27.4/23.4/3.6	[+] call	119.9/112.4/6.3

mk@gaffel::~/monet_5-0/src/mal/Tests> time Mserver -monetrc ../../../monet.conf  tst400a </dev/null >/dev/null
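
The x/y/z triples in these tables appear to be the real/user/sys seconds reported by time (in most rows the last two add up to roughly the first). A minimal Python sketch of collecting the same triple for an arbitrary command follows. The helper names are made up for this note, and the Mserver command line shown in the comment is only the one quoted above, not something the sketch runs.

```python
import os
import subprocess
import sys
import time

def time_command(argv):
    """Run argv with stdin/stdout on the null device; return (real, user, sys) seconds."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, stdin=subprocess.DEVNULL, stdout=subprocess.DEVNULL, check=False)
    real = time.perf_counter() - start
    after = os.times()
    # CPU time of waited-for children, as the shell's time builtin would report it.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system

def fmt(triple):
    """Format a triple in the real/user/sys notation used in the tables."""
    return "/".join(f"{t:.2f}" for t in triple)

# Stand-in command so the sketch runs anywhere; on the test machine this would be
# something like ["Mserver", "-monetrc", "../../../monet.conf", "tst400a"].
result = time_command([sys.executable, "-c", "pass"])
print(fmt(result))
```

On platforms where os.times() does not track child CPU time, the user and sys fields come out as zero, so the sketch is only meaningful on Unix-like systems.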

		gaffel(2x400MHz Celeron load 0.5)	Version 4.3
base		 0.97/0.4/0.53			1.0/ 0.4/0.57
tst400a		 1.6/1.0/0.6	1M loop	 	9.7/ 9.2/0.5
tst400a2	 2.1/1.6/0.5	1M loop inc
tst400bHuge 	 4.3/3.5/0.7	100k count (io)	18.7/16.9/1.7
tst400cHuge	 4.6/3.7/0.8	100k insert(io)	22.6/20.5/2.1
tst400d		 7.7/5.9/1.7	1M insert tuple 39.2/36.2/3.5
tst400e		 4.5/3.8/0.5	1M incr fcn    	17.7/17.2/0.5

tst901a		9.2/6.8/2.4	[+] call	40.4/36.6/3.7
tst901b		27.0/23.4/3.5	[+] call	118.8/110.5/7.4

The overall picture is a performance loss of about 25% on
both sides compared to Sept 2001. Since V5 and V4.3 use the same
GDK code, I suspect this is a result of Linux kernel changes.

Outliers are cHuge and tst400d. Both test insertion into a BAT.
The overall loss is on the order of 10%. Furthermore, tst400d
on V5 shows unexpected behavior.
tst400d: changing the BBPfix in CMDinsert into a direct update of the reference
        count saved 0.7 usr sec. In combination with the BBPunfix part of the
        garbage collector being called before or after each instruction, this
        means that the test spends 25% of its time in locking
        activity. It seems that locking has become significantly more
        expensive. (This might also explain the slow start of gdb on
        Mserver.)
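
The difference between a lock-protected reference count update and a direct one can be sketched as follows. This is a Python illustration of the pattern only: BatRef, fix_locked and fix_direct are hypothetical names, not GDK code, and the timings it prints say nothing about the actual cost of BBPfix().

```python
import threading
import time

class BatRef:
    """Hypothetical stand-in for a BAT descriptor; not the GDK data structure."""
    def __init__(self):
        self.refs = 0
        self.lock = threading.Lock()

    def fix_locked(self):
        # Lock-protected increment, in the spirit of going through BBPfix().
        with self.lock:
            self.refs += 1

    def fix_direct(self):
        # Direct update of the reference count; only safe without concurrent writers.
        self.refs += 1

def time_calls(fn, n):
    """Call fn n times and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return time.perf_counter() - start

N = 200_000
locked, direct = BatRef(), BatRef()
t_locked = time_calls(locked.fix_locked, N)
t_direct = time_calls(direct.fix_direct, N)
print(f"locked {t_locked:.3f} s, direct {t_direct:.3f} s for {N} updates")
```

The direct variant gives up the protection the lock provides, so it only applies where a single thread owns the descriptor during the update.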

The chains in the symbol table have become quite long (mdb->scope).
For example, reaching +() requires passing 15 modules and 35 signature tests
on average.
If you identify the module, calc.+(), you skip one module for inspection
and save around 300 ms/100K calls in tst400bHuge.
If you could 'jump' to the correct signature within the module, then
you would save another 300 ms. This means that type resolution in tst400bHuge
takes about 20% of the resources.

insert() requires 16 module passes and 63 (!) signature comparisons on average.
This leads to a difference in tst400cHuge of 970 ms/100K calls, about 25% of the test.
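
The cost of walking the scope chain versus jumping to the right signature can be sketched with a toy symbol table. The module contents, the function names linear_resolve and build_index, and the dictionary index are all assumptions made for this note; the real mdb->scope structure is not reproduced here.

```python
def linear_resolve(scope, name, argtypes):
    """Walk the module chain; return (module, modules_passed, signatures_tested)."""
    modules_passed = 0
    tested = 0
    for module, signatures in scope:
        modules_passed += 1
        for sig_name, sig_args in signatures:
            tested += 1
            if sig_name == name and sig_args == argtypes:
                return module, modules_passed, tested
    return None, modules_passed, tested

def build_index(scope):
    """Hash (module, name, argtypes) so a qualified call needs one probe."""
    index = {}
    for module, signatures in scope:
        for sig_name, sig_args in signatures:
            index.setdefault((module, sig_name, sig_args), module)
    return index

# Toy scope: a chain of modules, each with a list of signatures.
scope = [
    ("io",   [("print", ("str",))]),
    ("bat",  [("insert", ("bat", "lng", "lng")), ("new", ("type", "type"))]),
    ("calc", [("-", ("lng", "lng")), ("+", ("int", "int")), ("+", ("lng", "lng"))]),
]

found = linear_resolve(scope, "+", ("lng", "lng"))
index = build_index(scope)
direct = index[("calc", "+", ("lng", "lng"))]

# Per-call saving implied by the note above: 300 ms over 100K calls.
saving_us_per_call = 300 * 1000 / 100_000

print(found, direct, saving_us_per_call)
```

In the toy scope the unqualified lookup passes three modules and tests six signatures before it finds calc.+(lng,lng), while the indexed lookup needs a single probe.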

Minor update to source (before/after/final hooks)
and better stack preparation in mal_interpreter
tst400a		 1.2/0.8/0.4	1M loop	 	
tst400a2	 1.6/1.2/0.4	1M loop inc
tst400bHuge 	 3.3/2.8/0.5	100k count (io)	
tst400cHuge	 3.4/2.8/0.6	100k insert(io)	
tst400d		 6.6/5.4/1.2	1M insert tuple 
tst400e		 3.0/2.6/0.4	1M incr fcn    	
tst901a		 7.6/5.9/1.7	[+] call	
tst901b		24.2/21.1/3.0	[+] call	

The first hour of the dual Athlon	and V4.3 compiled with -O2!!!
		orion(2x1.4GHz load 0)	
base		 0.3/0.1/0.2			0.29/0.11/0.18
tst400a		 0.45/0.3/0.15	1M loop	 	2.0/1.89/0.14
tst400a2	 0.6/0.4/0.2	1M loop inc	
tst400bHuge 	 1.26/1.0/0.22	100k count (io)	4.1/3.90/0.27
tst400cHuge	 1.3/1.0/0.28	100k insert(io)	4.5/4.11/0.47
tst400d		 1.4/1.0/0.4	1M insert tuple 7.0/6.1/0.88
tst400e		 1.1/0.9/0.2	1M incr fcn     3.8/3.63/0.21

tst901a		 1.9/1.5/0.4	[+] call	7.1/6.19/0.91
tst901b		 5.9/5.4/0.5	[+] call	19/17.8/1.64

Aligning the compilation flags shows
that V5 is about 4 times faster on these tests
(previous tests indicated factors of 3.3-7 (17/9) and 3.9-5 (22/3)).
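
The "about 4 times" figure can be checked against the orion table above. The sketch below assumes the first value of each triple is the real (wall-clock) time, leaves out the base row since it only measures startup, and summarizes the per-test ratios with a geometric mean. That choice of summary is made for this note, not taken from the original measurements.

```python
import math

# (test, V5 triple, V4.3 triple) as recorded for orion above.
ORION = [
    ("tst400a",     "0.45/0.3/0.15", "2.0/1.89/0.14"),
    ("tst400bHuge", "1.26/1.0/0.22", "4.1/3.90/0.27"),
    ("tst400cHuge", "1.3/1.0/0.28",  "4.5/4.11/0.47"),
    ("tst400d",     "1.4/1.0/0.4",   "7.0/6.1/0.88"),
    ("tst400e",     "1.1/0.9/0.2",   "3.8/3.63/0.21"),
    ("tst901a",     "1.9/1.5/0.4",   "7.1/6.19/0.91"),
    ("tst901b",     "5.9/5.4/0.5",   "19/17.8/1.64"),
]

def real(triple):
    """First field of a real/user/sys triple, in seconds."""
    return float(triple.split("/")[0])

def speedups(rows):
    """Ratio of V4.3 real time to V5 real time per test."""
    return {name: real(v4) / real(v5) for name, v5, v4 in rows}

def geometric_mean(values):
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

ratios = speedups(ORION)
overall = geometric_mean(ratios.values())
for name, ratio in ratios.items():
    print(f"{name:12s} {ratio:4.1f}x")
print(f"geometric mean {overall:.2f}x")
```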

======================== 15 March 2003 ===================

The compilation mode for both V5 and V4.3.7 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)	
base		 0.33/0.13/0.20			0.28/0.08/0.20
tst400a		 0.49/0.29/0.19	1M loop	 	2.0/1.84/0.18
tst400a2	 0.62/0.44/0.18	1M loop inc	
tst400bHuge 	 1.47/1.2/0.24	100k count (io)	5.3/4.90/0.44
tst400cHuge	 1.55/1.3/0.24	100k insert(io)	6.0/5.60/0.42
tst400d		 1.4/1.1/0.3	1M insert tuple 7.3/6.6/0.68
tst400e		 1.2/1.0/0.2	1M incr fcn     3.7/3.6/0.19

tst901a		 1.9/1.5/0.4	[+] call	7.4/6.7/0.7
tst901b		 5.5/5.0/0.5	[+] call	20/18.5/1.64
======================== 13 April 2003 ===================

The compilation mode for both V5 and V4.3.7 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)	
base		 0.33/0.13/0.20			0.28/0.08/0.20
tst400a		 0.52/0.31/0.20	1M loop	 	2.0/1.84/0.18
tst400a2	 0.65/0.44/0.19	1M loop inc	
tst400bHuge 	 1.10/0.85/0.24	100k count (io)	5.3/4.90/0.44
tst400cHuge	 1.32/1.08/0.24	100k insert(io)	6.0/5.60/0.42
tst400d		 1.36/1.07/0.29	1M insert tuple 7.3/6.6/0.68
tst400e		 1.32/1.12/0.2	1M incr fcn     3.7/3.6/0.19

tst901a		 1.9/1.3/0.6	[+] call	7.4/6.7/0.7
tst901b		 4.3/4.0/0.26	[+] call	20/18.5/1.64
======================== 20 July 2003 ===================

The compilation mode for both V5 and V4.3.7 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)	
base		 0.29/0.14/0.16			
tst400a		 0.50/0.31/0.18	1M loop	 	
tst400a2	 0.59/0.42/0.17	1M loop inc	
tst400bHuge 	 1.16/0.93/0.23	100k count (io)	
tst400cHuge	 1.43/1.13/0.27	100k insert(io)	
tst400d		 1.35/1.06/0.29	1M insert tuple 
tst400e		 1.33/0.97/0.16	1M incr fcn     

tst901a		 1.8/1.4/0.4	[+] call	
tst901b		 4.4/4.1/0.27	[+] call	
======================== 29 May 2004 ===================
Same setup on Orion
The compilation mode for both V5 and V4.3.17 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)	
base		 0.10/0.08/0.02			 0.80/0.14/0.68
tst400a		 0.25/0.23/0.02	1M loop	 	 2.80/2.14/0.66
tst400a2	 0.37/0.35/0.02	1M loop inc	
tst400bHuge 	 0.83/0.77/0.05	100k count (io)	11.69/10.77/0.85
tst400cHuge	 0.97/0.83/0.14	100k insert(io)	 5.25/ 4.21/0.98
tst400d		 0.73/0.72/0.01	1M insert tuple  9.04/ 8.34/0.67
tst400e		 1.02/1.01/0.01	1M incr fcn      4.66/ 4.01/0.60

tst901a		 0.98/0.93/0.05	[+] call	 8.95/ 8.24/0.68
tst901b		 4.26/4.13/0.13	[+] call	23.75/23.07/0.65
======================== July 5, 2004 ===================
Same setup on Orion
The compilation mode for both V5 and V4.3.17 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)	
base		 0.10/0.08/0.02			 
tst400a		 0.20/0.18/0.02	1M loop	 	 
tst400a2	 0.33/0.32/0.01	1M loop inc	
tst400bHuge 	 0.77/0.70/0.06	100k count (io)	
tst400cHuge	 1.02/0.89/0.13	100k insert(io)	 
tst400d		 0.70/0.67/0.03	1M insert tuple  
tst400e		 1.02/1.01/0.01	1M incr fcn      

tst901a		 0.85/0.80/0.05	[+] call	 
tst901b		 4.12/4.07/0.05	[+] call	
======================== 7 Oct 2004 ===================
Setup on Orion with Fedora 2
The compilation mode for both V5 and V4.4 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)		orion (V4.4)	
base		 0.19/0.10/0.08			 1.76/0.30/1.45
tst400a		 0.30/0.20/0.10	1M loop	 	 3.95/2.45/1.48
tst400a2	 0.42/0.33/0.09	1M loop inc	
tst400bHuge 	 0.90/0.74/0.15	100k count (io)	13.01/11.06/1.95
tst400cHuge	 1.16/0.93/0.22	100k insert(io)	 6.40/ 4.31/2.07
tst400d		 0.75/0.64/0.11	1M insert tuple  10.32/ 8.76/1.55
tst400e		 1.30/1.29/0.10	1M incr fcn      5.78/ 4.27/1.49

tst901a		 1.04/0.80/0.23	[+] call	 10.24/ 8.61/1.59
tst901b		 4.05/3.83/0.21	[+] call	26.04/24.72/1.72
======================== 7 Oct 2004 ===================
Setup on Orion with Fedora 2, and on P3 350Mhz with Fedora 1
The compilation mode for both V5 and V4.4 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)		P3 350 V4.4
base		 0.19/0.10/0.08			 2.99/0.50/1.33	
tst400a		 0.30/0.20/0.10	1M loop	 	 9.71/8.19/1.45
tst400a2	 0.42/0.33/0.09	1M loop inc	
tst400bHuge 	 0.90/0.74/0.15	100k count (io)	55.38/37.87/2.71
tst400cHuge	 1.16/0.93/0.22	100k insert(io)	55.48/18.01/4.91
tst400d		 0.75/0.64/0.11	1M insert tuple 40.58/38.16/1.50
tst400e		 1.30/1.29/0.10	1M incr fcn     17.99/16.56/1.36

tst901a		 1.04/0.80/0.23	[+] call	 40.36/38.59/1.47
tst901b		 4.05/3.83/0.21	[+] call	112.45/110.42/1.73
======================== 15 Oct 2004 ===================
Setup on Shuttle with Fedora 2, and on AMD 
The compilation mode for both V5 and V4.4 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
                Version 4.99			Version 4.4
base             0.06/0.03/0.02                  0.06/0.03/0.01
tst400a          0.10/0.07/0.03 1M loop          0.70/0.69/0.01
tst400a2         		1M loop inc
tst400bHuge      0.30/0.25/0.05 100k count (io)  4.52/4.33/0.18
tst400cHuge      0.41/0.34/0.07 100k insert(io)  1.71/1.51/0.21 
tst400d          0.26/0.22/0.03 1M insert tuple  2.80/2.78/0.02
tst400e          0.43/0.41/0.02 1M incr fcn      1.45/1.44/0.01

tst901a          0.32/0.25/0.06 [+] call         2.76/2.72/0.03
tst901b          1.29/1.22/0.07 [+] call         7.44/7.39/0.05

======================== 16 Oct 2004 ===================
The situation on Ara with a new kernel
The compilation mode for both V5 and V4.4 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
                 V4.99 				V4.4
base             0.30/0.10/0.05                  1.44/0.18/1.21
tst400a          0.37/0.19/0.06 1M loop          3.60/2.19/1.35
tst400a2         0.53/0.37/0.04	1M loop inc	 0.93/0.17/0.71
tst400bHuge      0.97/0.72/0.10 100k count (io) 12.21/10.60/1.40
tst400cHuge      1.12/0.94/0.14 100k insert(io)  5.71/4.01/1.45 
tst400d          0.80/0.60/0.08 1M insert tuple  9.24/8.39/0.75
tst400e          1.37/1.2/0.04 1M incr fcn       4.91/4.06/0.78

tst901a          1.04/0.76/0.17 [+] call         9.31/8.37/0.84
tst901b          3.92/3.59/0.18 [+] call        24.45/23.02/0.83

======================== 16 Oct 2004 ===================
The situation on Stem with a new kernel
The compilation mode for both V5 and V4.4 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
                 V4.99 				V4.4
base             0.26/0.09/0.04                  0.81/0.14/0.61
tst400a          0.36/0.20/0.04 1M loop          2.73/2.07/0.61
tst400a2         0.53/0.35/0.04	1M loop inc	 0.81/0.15/0.61
tst400bHuge      0.97/0.70/0.09 100k count (io) 11.38/10.18/0.94
tst400cHuge      1.12/0.94/0.15 100k insert(io)  5.00/3.80/1.12 
tst400d          0.75/0.57/0.05 1M insert tuple  8.71/7.88/0.70
tst400e          1.32/1.15/0.04 1M incr fcn       4.91/4.06/0.78

tst901a          1.03/0.77/0.15 [+] call         8.87/8.02/0.68
tst901b          3.61/3.33/0.16 [+] call        22.77/21.96/0.70

======================== 17 Oct 2004 ===================
Running V4.4 using embedded features
Mserver --set monet_embedded=yes --dbinit="module(mapi);listen();"
and running tests via MapiClient
                 V4.4
base             0.16
tst400a          2.2 	1M loop         
tst400a2         	1M loop inc
tst400bHuge      13.6	100k count (io) 
tst400cHuge      4.8	100k insert(io)
tst400d          7.5	1M insert tuple
tst400e          4.2	1M incr fcn    

tst901a          7.5	[+] call      
tst901b          21.2	[+] call     

======================== 11 Nov 2004 ===================
Setup on Orion with Fedora 2 patched 
The compilation mode for both V5 and V4.4 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)		 V4.5
base		 0.11/0.06/0.05			 0.81/0.15/0.65	
tst400a		 0.20/0.16/0.04	1M loop	 	 2.86/2.16/0.69
tst400a2	 0.34/0.29/0.05	1M loop inc	
tst400bHuge 	 0.82/0.71/0.10	100k count (io)	11.70/10.61/0.99
tst400cHuge	 1.03/0.89/0.14	100k insert(io)	 4.72/3.61/1.07
tst400d		 0.57/0.52/0.05	1M insert tuple  7.78/7.08/0.68
tst400e		 1.18/1.14/0.04	1M incr fcn      4.78/4.17/0.60

tst901a		 0.86/0.68/0.16	[+] call	 7.56/6.82/0.69
tst901b		 3.57/3.43/0.13	[+] call	20.10/19.18/0.81
======================== 20 dec 2004 ===================
Setup on Orion with bstream enhancement 
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.12/0.07/0.05			
tst400a		 0.22/0.17/0.04	1M loop	 	
tst400a2	 0.36/0.30/0.05	1M loop inc	
tst400bHuge 	 0.83/0.72/0.10	100k count (io)
tst400cHuge	 1.20/1.07/0.11	100k insert(io)	
tst400d		 0.66/0.59/0.07	1M insert tuple 
tst400e		 1.11/1.06/0.05	1M incr fcn     

tst901a		 0.90/0.74/0.15	[+] call	 
tst901b		 3.60/3.42/0.16	[+] call	
======================== 22 feb 2005 ===================
Setup on Orion just before refcount updates
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.12/0.07/0.05			
tst400a		 0.19/0.15/0.04	1M loop	 	
tst400a2	 0.34/0.31/0.02	1M loop inc	
tst400bHuge	 0.80/0.72/0.08	100k count (io)
tst400cHuge	 1.20/1.07/0.11	100k insert(io)	
tst400d		 0.66/0.59/0.07	1M insert tuple 
tst400e		 1.16/1.11/0.05	1M incr fcn     

tst901a		 0.78/0.65/0.12	[+] call	 
tst901b		 3.60/3.42/0.16	[+] call	
======================== 15 Apr 2005 ===================
Setup on Orion just after the parser requires fully qualified names.
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.08/0.06/0.02			
tst400a		 0.20/0.18/0.02	1M loop	 	
tst400a2	 0.34/0.31/0.02	1M loop inc	
tst400bHuge  0.84/0.73/0.08	100k count (io)
tst400cHuge	 1.20/1.07/0.11	100k insert(io)	
tst400d		 1.09/1.00/0.07	1M insert tuple 
tst400e		 1.09/1.07/0.02	1M incr fcn     

tst901a		 1.20/1.16/0.12	[+] call	 
tst901b		 5.07/4.92/0.14	[+] call	
The performance drain is significant, most likely related to
correcting incref/fix/unfix. Callgrind traces collected in callgrind/20050415

======================== 17 Apr 2005 ===================
Setup on Orion just after the parser requires fully qualified names
and optimization to reduce parsing and type checking.
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.08/0.06/0.02			
tst400a		 0.20/0.17/0.02	1M loop	 	
tst400a2	 0.34/0.31/0.02	1M loop inc	
tst400bHuge	 0.64/0.53/0.09	100k count (io)
tst400cHuge	 0.79/0.68/0.11	100k insert(io)	
tst400d		 1.07/1.02/0.04	1M insert tuple 
tst400e		 1.09/1.07/0.02	1M incr fcn     

tst901a		 1.20/1.08/0.12	[+] call	 
tst901b		 5.07/4.92/0.14	[+] call	
The performance drain is significant, most likely related to
correcting incref/fix/unfix. Callgrind traces collected in callgrind/20050415

======================== 20 Apr 2005 ===================
Setup on Orion just after the descriptor change and minor polishing of M5
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.08/0.06/0.02			
tst400a		 0.19/0.16/0.02	1M loop	 	
tst400a2	 0.34/0.31/0.02	1M loop inc	
tst400bHuge	 0.60/0.50/0.08	100k count (io)
tst400cHuge	 0.77/0.68/0.08	100k insert(io)	
tst400d		 0.96/0.89/0.08	1M insert tuple 
tst400e		 1.07/1.04/0.02	1M incr fcn     

tst901a		 1.17/1.00/0.10	[+] call	 
tst901b		 4.86/4.71/0.14	[+] call	
======================== 23 Apr 2005 ===================
Setup on Orion just after BBP changes to reduce locks.
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.09/0.06/0.02			
tst400a		 0.19/0.16/0.02	1M loop	 	
tst400a2	 0.34/0.31/0.02	1M loop inc	
tst400bHuge	 0.59/0.49/0.08	100k count (io)
tst400cHuge	 0.73/0.64/0.08	100k insert(io)	
tst400d		 0.89/0.83/0.05	1M insert tuple 
tst400e		 1.07/1.04/0.02	1M incr fcn     

tst901a		 1.10/0.97/0.10	[+] call	 
tst901b		 4.63/4.44/0.16	[+] call	
======================== 24 Apr 2005 ===================
Setup on Orion just after cpu savers and small bats
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.09/0.06/0.02			
tst400a		 0.18/0.16/0.02	1M loop	 	
tst400a2	 0.33/0.30/0.02	1M loop inc	
tst400bHuge	 0.56/0.48/0.08	100k count (io)
tst400cHuge	 0.73/0.65/0.08	100k insert(io)	
tst400d		 0.93/0.89/0.04	1M insert tuple 
tst400e		 1.07/1.04/0.02	1M incr fcn     

tst901a		 1.10/0.97/0.10	[+] call	 
tst901b		 4.63/4.44/0.16	[+] call	
======================== 5 May 2005 ===================
Setup on Orion after BAT reductions
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.08/0.055/0.02			
tst400a		 0.17/0.15/0.02	1M loop	 	
tst400a2	 0.30/0.28/0.02	1M loop inc	
tst400bHuge	 0.57/0.49/0.08	100k count (io)
tst400cHuge	 0.73/0.62/0.08	100k insert(io)	
tst400d		 0.91/0.87/0.04	1M insert tuple 
tst400e		 1.05/1.02/0.02	1M incr fcn     

tst901a		 1.10/0.97/0.10	[+] call	 
tst901b		 4.63/4.44/0.16	[+] call	
======================== 2 jul 2005 ===================
Setup on Orion before new machines arrive, version 4.9 kernel
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		orion(2x1.4GHz load 0)
base		 0.10/0.07/0.03	
tst400a		 0.19/0.16/0.03	1M loop	 	
tst400a2	 0.33/0.29/0.03	1M loop inc	
tst400bHuge	 0.58/0.49/0.09	100k count (io)
tst400cHuge	 0.76/0.66/0.09	100k insert(io)	
tst400d		 0.94/0.89/0.05	1M insert tuple 
tst400e		 0.95/0.92/0.03	1M incr fcn     

tst901a		 1.14/1.03/0.11	[+] call	 
tst901b		 4.85/4.68/0.16	[+] call	
======================== 5 jul 2005 ===================
Setup on Gio version 4.9 kernel
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
base		 0.06/0.04/0.02	
tst400a		 0.12/0.10/0.02	1M loop	 	
tst400a2	 0.23/0.20/0.02	1M loop inc	
tst400bHuge	 0.30/0.27/0.03	100k count (io)
tst400cHuge	 0.44/0.39/0.03	100k insert(io)	
tst400d		 0.70/0.67/0.02	1M insert tuple 
tst400e		 0.74/0.71/0.02	1M incr fcn     

tst901a		 0.77/0.72/0.04	[+] call	 
tst901b		 3.50/3.38/0.08	[+] call	
======================== 11 jul 2005 ===================
Setup on Gio version 4.9 kernel after complete recompilation
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
base		 0.06/0.04/0.02	
tst400a		 0.11/0.09/0.02	1M loop	 	
tst400a2	 0.16/0.14/0.02	1M loop inc	
tst400bHuge	 0.24/0.20/0.04	100k count (io)
tst400cHuge	 0.32/0.26/0.05	100k insert(io)	
tst400d		 0.40/0.38/0.02	1M insert tuple 
tst400e		 0.46/0.44/0.02	1M incr fcn     

tst901a		 0.45/0.41/0.04	[+] call	 
tst901b		 1.85/1.78/0.06	[+] call	
======================== 16 aug 2005 ===================
Setup on Gio version 4.9 kernel after beautifier
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
base		 0.055/0.038/0.014	
tst400a		 0.106/0.091/0.014	1M loop	 	
tst400a2	 0.163/0.145/0.013	1M loop inc	
tst400bHuge	 0.269/0.216/0.046	100k count (io)
tst400cHuge	 0.363/0.318/0.045	100k insert(io)	
tst400d		 0.412/0.374/0.025	1M insert tuple 
tst400e		 0.470/0.440/0.021	1M incr fcn     

tst901a		 0.472/0.416/0.040	[+] call	 
tst901b		 2.036/1.910/0.067	[+] call	
======================== 19 aug 2005 ===================
Setup on Gio version 4.9 kernel after code squeeze
The compilation mode for V5 is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
base		 0.055/0.038/0.014	
tst400a		 0.111/0.091/0.014	1M loop	 	
tst400a2	 0.165/0.147/0.013	1M loop inc	
tst400bHuge	 0.249/0.200/0.034	100k count (io)
tst400cHuge	 0.327/0.266/0.045	100k insert(io)	
tst400d		 0.444/0.409/0.025	1M insert tuple 
tst400e		 0.412/0.387/0.015	1M incr fcn     

tst901a		 0.508/0.445/0.040	[+] call	 
tst901b		 2.048/1.914/0.053	[+] call	
======================== 20 aug 2005 ===================
Setup on Gio version 4.9 kernel after enabling alloc_map
The compilation mode for both versions is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5				MonetDB 4.9.3
base		 0.048/0.036/0.012			0.033/0.024/0.009   0.7
tst400a		 0.104/0.093/0.011	1M loop	 	0.870/0.859/0.011   8.3
tst400a2	 0.155/0.149/0.006	1M loop inc		
tst400bHuge	 0.222/0.184/0.039	100k count (io) 1.115/0.963/0.152   5.0
tst400cHuge	 0.306/0.266/0.040	100k insert(io)	1.693/1.466/0.217   5.5
tst400d		 0.409/0.381/0.027	1M insert tuple 2.519/2.503/0.015   6.1
tst400e		 0.382/0.365/0.016	1M incr fcn     1.787/1.777/0.010   4.7

tst901a		 0.462/0.431/0.031	[+] call    2.474/2.430/0.044   5.3
tst901b		 1.865/1.776/0.059	[+] call    7.309/7.244/0.064   3.9
======================== 13 sep 2005 ===================
Setup on Gio version 4.9 kernel 
The compilation mode for both versions is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5				MonetDB 4.9.3
base		 0.048/0.036/0.012			0.033/0.024/0.009   0.7
tst400a		 0.114/0.098/0.016	1M loop	 	0.870/0.859/0.011   8.3
tst400a2	 0.165/0.149/0.016	1M loop inc		
tst400bHuge	 0.252/0.218/0.031	100k count (io) 1.115/0.963/0.152   5.0
tst400cHuge	 0.304/0.251/0.047	100k insert(io)	1.693/1.466/0.217   5.5
tst400d		 0.310/0.282/0.022	1M insert tuple 2.519/2.503/0.015   6.1
tst400e		 0.399/0.374/0.018	1M incr fcn     1.787/1.777/0.010   4.7

tst901a		 0.473/0.324/0.042	[+] call    2.474/2.430/0.044   5.3
tst901b		 1.702/1.615/0.054	[+] call    7.309/7.244/0.064   3.9
======================== 25 sep 2005 ===================
Setup on Gio version 4.9 kernel 
The compilation mode for both versions is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5				MonetDB 4.9.3
base		 0.066/0.046/0.020				0.033/0.024/0.009   0.5
tst400a		 0.116/0.102/0.014	1M loop	 	0.870/0.859/0.011   7.5
tst400a2	 0.169/0.159/0.011	1M loop inc		
tst400bHuge	 0.242/0.197/0.045	100k count (io) 1.115/0.963/0.152   4.6
tst400cHuge	 0.290/0.242/0.048	100k insert(io)	1.693/1.466/0.217   5.8
tst400d		 0.307/0.284/0.022	1M insert tuple 2.519/2.503/0.015   8.2
tst400e		 0.392/0.374/0.018	1M incr fcn     1.787/1.777/0.010   4.5

tst901a		 0.368/0.320/0.047	[+] call    2.474/2.430/0.044   6.7
tst901b		 1.665/1.603/0.061	[+] call    7.309/7.244/0.064   4.3
smalltable   0.077/0.058/0.019
======================== 5 Oct 2005 ===================
Setup on Gio version 4.9 kernel, after SQL has been included
and 'all' data structures are properly released.
The compilation mode for both versions is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5				MonetDB 4.9.3
base		 0.098/0.057/0.020				0.033/0.024/0.009   0.5
tst400a		 0.151/0.131/0.019	1M loop	 	0.870/0.859/0.011   7.5
tst400a2	 0.215/0.164/0.021	1M loop inc		
tst400bHuge	 0.291/0.227/0.047	100k count (io) 1.115/0.963/0.152   4.6
tst400cHuge	 0.345/0.254/0.050	100k insert(io)	1.693/1.466/0.217   5.8
tst400d		 0.347/0.299/0.029	1M insert tuple 2.519/2.503/0.015   8.2
tst400e		 0.447/0.390/0.020	1M incr fcn     1.787/1.777/0.010   4.5

tst901a		 0.428/0.343/0.048	[+] call    2.474/2.430/0.044   6.7
tst901b		 1.754/1.590/0.073	[+] call    7.309/7.244/0.064   4.3
smalltable   0.114/0.071/0.024
map100		 4.540/0.297/0.211
======================== 14 Oct 2005 ===================
Setup on Gio version 4.9 kernel, after code squeezing for SQL
The compilation mode for both versions is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5				MonetDB 4.9.3
base		 0.065/0.040/0.020				0.033/0.024/0.009   0.5
tst400a		 0.115/0.088/0.028	1M loop	 	0.870/0.859/0.011   7.5
tst400a2	 0.168/0.144/0.024	1M loop inc		
tst400bHuge	 0.228/0.172/0.052	100k count (io) 1.115/0.963/0.152   4.9
tst400cHuge	 0.287/0.204/0.084	100k insert(io)	1.693/1.466/0.217   5.9
tst400d		 0.301/0.260/0.040	1M insert tuple 2.519/2.503/0.015   8.3
tst400e		 0.423/0.388/0.028	1M incr fcn     1.787/1.777/0.010   4.2

tst901a		 0.365/0.308/0.056	[+] call    2.474/2.430/0.044   6.7
tst901b		 1.576/1.488/0.088	[+] call    7.309/7.244/0.064   4.6
smalltable   0.074/0.060/0.012
======================== 16 Nov 2005 ===================
Setup on Gio version 4.9 kernel, before and after Peter's MAL cleanup
The compilation mode for both versions is CFLAGS=-O2
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5				MonetDB 4.9.3
base		 0.114/0.040/0.060				0.050/0.024/0.012   	0.160
tst400a		 0.164/0.100/0.056	1M loop	 	0.806/0.788/0.008   	0.964/0.940/0.020
tst400bHuge	 0.288/0.196/0.080	100k count (io) 1.161/0.948/0.160   4.560/4.396/0.112
tst400cHuge	 0.333/0.240/0.084	100k insert(io)	1.737/1.428/0.224	2.357/2.224/0.128
tst400d		 0.364/0.284/0.068	1M insert tuple 2.352/2.240/0.024	2.528/2.500/0.024
tst400e		 0.471/0.412/0.044	1M incr fcn     1.668/1.628/0.024   1.990/1.964/0.024

tst901a		 0.427/0.352/0.064	[+] call    2.391/2.244/0.060   	2.659/2.484/0.076
tst901b		 1.712/1.608/0.088	[+] call    6.605/6.408/0.060   	7.248/7.104/0.072
smalltable   0.125/0.048/0.060
======================== 20 nov 2005 ===================
Setup on Gio version 4.9 kernel, after space reduction M5
The compilation mode for both V5 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5				MonetDB 4.9.3
base		 0.100/0.044/0.056				0.050/0.024/0.012   	0.160
tst400a		 0.144/0.092/0.052	1M loop	 	0.806/0.788/0.008   	0.964/0.940/0.020
tst400bHuge	 0.239/0.188/0.052	100k count (io) 1.161/0.948/0.160   4.560/4.396/0.112
tst400cHuge	 0.290/0.212/0.076	100k insert(io)	1.737/1.428/0.224	2.357/2.224/0.128
tst400d		 0.335/0.292/0.044	1M insert tuple 2.352/2.240/0.024	2.528/2.500/0.024
tst400e		 0.455/0.404/0.052	1M incr fcn     1.668/1.628/0.024   1.990/1.964/0.024

tst901a		 0.403/0.332/0.072	[+] call    2.391/2.244/0.060   	2.659/2.484/0.076
tst901b		 1.610/1.504/0.106	[+] call    6.605/6.408/0.060   	7.248/7.104/0.072
smalltable   0.125/0.048/0.060
======================== 20 nov 2005 ===================
Setup on Gio version 4.9 kernel, after cleanup of statistics
The compilation mode for both V5 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							MonetDB 4.9.3
base		 0.055/0.044/0.012					0.032/0.024/0.008	0.6x
tst400a		 0.107/0.096/0.012	1M loop	 		0.869/0.860/0.008	8.1x  (16x)
tst400bHuge	 0.201/0.156/0.044	100k count (io) 1.632/1.564/0.068 	8.1x (10.9x)
tst400cHuge	 0.249/0.204/0.044	100k insert(io)	1.972/1.888/0.084   7.9x (9.8x)
tst400d		 0.298/0.280/0.016	1M insert tuple  2.615/2.596/0.020	8.7x (10.6x)
tst400e		 0.408/0.388/0.020	1M incr fcn      1.759/1.744/0.016  4.3x (4.9x)


tst901a		 0.361/0.316/0.044	[+] call    	2.653/2.620/0.032	7.3x (8.6x)
tst901b		 1.671/1.612/0.060	[+] call   		7.649/7.580/0.064	4.6x (4.7x)
smalltable   0.064/0.052/0.012
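
Note: the ratio columns above look like the MonetDB 4.9.3 real time divided
by the MonetDB 5 real time, with the figure in parentheses the same ratio
after subtracting each server's base (startup) time. That reading is an
inference from the numbers, not something the log states. speedup() and the
variable names are made up for this sketch; the timings are copied from the
table.

```python
def speedup(v4_real, v5_real, v4_base=0.0, v5_base=0.0):
    """Ratio of MonetDB 4 to MonetDB 5 elapsed time.

    With the base times given, startup cost is subtracted from both
    sides first, so only the work of the test itself is compared.
    """
    return (v4_real - v4_base) / (v5_real - v5_base)

# Real (elapsed) seconds: test -> (MonetDB 5, MonetDB 4.9.3)
runs = {
    "tst400a": (0.107, 0.869),
    "tst400bHuge": (0.201, 1.632),
    "tst400d": (0.298, 2.615),
    "tst901b": (1.671, 7.649),
}
V5_BASE, V4_BASE = 0.055, 0.032   # the "base" row

if __name__ == "__main__":
    for name, (v5, v4) in runs.items():
        raw = speedup(v4, v5)
        net = speedup(v4, v5, V4_BASE, V5_BASE)
        print(f"{name:12s} {raw:4.1f}x ({net:.1f}x)")
```

For these four rows the computed values agree with the table to within
about 0.1, which is consistent with the table having been truncated or
rounded by hand.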
======================== 18 dec 2005 ===================
Setup on Gio version 4.9 kernel, after cleanup of statistics
The compilation mode for both V5 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							MonetDB 4.9.3
base		 0.055/0.044/0.012					0.032/0.024/0.008	
tst400a		 0.115/0.100/0.015	1M loop	 		0.869/0.860/0.008
tst400bHuge	 0.208/0.192/0.016	100k count (io) 1.632/1.564/0.068
tst400cHuge	 0.270/0.244/0.028	100k insert(io)	1.972/1.888/0.084
tst400d		 0.309/0.276/0.032	1M insert tuple  2.615/2.596/0.020
tst400e		 0.395/0.384/0.008	1M incr fcn      1.759/1.744/0.016

tst901a		 0.370/0.336/0.032	[+] call    	2.653/2.620/0.032
tst901b		 1.728/1.672/0.056	[+] call   		7.649/7.580/0.064
smalltable   0.064/0.052/0.012
======================== 10 jan 2006 ===================
Setup on Gio version 4.9 kernel , after code squeezing
Changed profile to ignore mapi thread
The compilation mode for both V5 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							MonetDB 4.9.3
base		 0.073/0.044/0.024					0.032/0.024/0.008	0.4x
tst400a		 0.115/0.096/0.020	1M loop	 		0.869/0.860/0.008	7.5x (19.9x)
tst400bHuge	 0.210/0.164/0.040	100k count (io) 1.632/1.564/0.068   7.7x (11.6x)
tst400cHuge	 0.276/0.216/0.052	100k insert(io)	1.972/1.888/0.084	7.1x ( 9.5x)
tst400d		 0.310/0.272/0.024	1M insert tuple  2.615/2.596/0.020  8.4x (10.9x)
tst400e		 0.391/0.352/0.024	1M incr fcn      1.759/1.744/0.016	4.5x (5.4x)

tst901a		 0.378/0.304/0.060	[+] call    	2.653/2.620/0.032	7.0x (8.6x)
tst901b		 1.672/1.548/0.056	[+] call   		7.649/7.580/0.064	4.5x (4.7x)
smalltable   0.064/0.052/0.012
======================== 18 mar 2006 ===================
Setup on Gio version 4.9 kernel, before all of Peter's kernel changes take effect
Kernel changes related to BBP updates and caching now active
The compilation mode for both V5 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							MonetDB 4.9.3
base		 0.056/0.040/0.016
tst400a		 0.097/0.072/0.024	1M loop
tst400bHuge	 0.196/0.164/0.032	100k count (io)
tst400cHuge	 0.256/0.216/0.040	100k insert(io)
tst400d		 0.365/0.292/0.065	1M insert tuple  
tst400e		 0.348/0.328/0.016	1M incr fcn

tst901a		 0.509/0.380/0.116	[+] call
tst901b		 1.813/1.556/0.224	[+] call   		
smalltable   0.064/0.052/0.012
======================== 11 apr 2006 ===================
Setup on Gio version 4.9 kernel, 
Athlon 64 3000+ converted to X2 3800

The compilation mode for both V5 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							MonetDB 4.9.3
base		 0.051/0.036/0.012
tst400a		 0.089/0.076/0.012	1M loop
tst400bHuge	 0.177/0.156/0.020	100k count (io)
tst400cHuge	 0.232/0.200/0.032	100k insert(io)
tst400d		 0.310/0.256/0.052	1M insert tuple  
tst400e		 0.325/0.304/0.016	1M incr fcn

tst901a		 0.459/0.352/0.108	[+] call
tst901b		 1.645/1.408/0.232	[+] call   		
smalltable   0.059/0.032/0.024
======================== 28 May 2006 ===================
Setup on Gio version 4.9 kernel, 
Athlon 64 3000+ converted to X2 3800
The compilation mode for both V5 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							MonetDB 4.9.3
base		 0.076/0.040/0.036					+25/ +4/+24 ms
tst400a		 0.114/0.084/0.032	1M loop			+25/ +8/+20 ms
tst400bHuge	 0.214/0.172/0.040	100k count (io)	+37/+16/+20 ms
tst400cHuge	 0.269/0.212/0.060	100k insert(io) +37/+12/+28 ms
tst400d		 0.372/0.288/0.084	1M insert tuple +62/+32/+32 ms
tst400e		 0.334/0.304/0.028	1M incr fcn		 +9/  0/+12 ms

tst901a		 0.521/0.336/0.188	[+] call		+62/-16/+80 ms
tst901b		 1.721/1.396/0.324	[+] call   		+76/-12/+92 ms
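
Note: the "+N ms" column in this block appears to be the per-field
difference (real/user/sys) against the 11 apr 2006 run. That is an
inference from the numbers. A small sketch of the arithmetic follows;
parse_triple() and delta_ms() are names made up here, and the timings are
copied from the two blocks.

```python
def parse_triple(text):
    """Split 'real/user/sys' (seconds) into three floats."""
    real, user, sys_ = (float(x) for x in text.split("/"))
    return real, user, sys_

def delta_ms(new, old):
    """Per-field difference new - old, rounded to whole milliseconds."""
    return tuple(
        round((n - o) * 1000)
        for n, o in zip(parse_triple(new), parse_triple(old))
    )

# test -> (28 May 2006 timing, 11 apr 2006 timing)
pairs = {
    "base": ("0.076/0.040/0.036", "0.051/0.036/0.012"),
    "tst400d": ("0.372/0.288/0.084", "0.310/0.256/0.052"),
    "tst901a": ("0.521/0.336/0.188", "0.459/0.352/0.108"),
}

if __name__ == "__main__":
    for name, (new, old) in pairs.items():
        print(name, "/".join(f"{d:+d}" for d in delta_ms(new, old)), "ms")
```

A negative user delta next to a positive sys delta, as in tst901a, means
the time moved from user to system mode rather than disappearing.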
======================== 28 May 2006 ===================
Setup on Gio version 4.9 kernel, 
Athlon 64 3000+ converted to X2 3800
The compilation mode for both V5 is CFLAGS=-O2 
command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							MonetDB 4.9.3
base		 0.088/0.052/0.034					
tst400a		 0.125/0.076/0.040	1M loop			
tst400bHuge	 0.223/0.168/0.056	100k count (io)	
tst400cHuge	 0.284/0.232/0.048	100k insert(io) 
tst400d		 0.410/0.292/0.084	1M insert tuple +62/+32/+32 ms
tst400e		 0.361/0.320/0.036	1M incr fcn		 +9/  0/+12 ms

tst901a		 0.532/0.369/0.160	[+] call		+62/-16/+80 ms
tst901b		 1.853/1.580/0.268	[+] call   		+76/-12/+92 ms
======================== 2 April 2007 ===================
Setup on Gio version 5.0-beta2 kernel, 
Athlon 64 3000+ converted to X2 3800
The compilation mode for V5 is ???

command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							
base		 0.162/0.121/0.039					
tst400a		 0.399/0.339/0.044	1M loop			
tst400bHuge	 0.223/0.168/0.056	100k count (io)	
tst400cHuge	 0.284/0.232/0.048	100k insert(io) 
tst400d		 0.410/0.292/0.084	1M insert tuple +62/+32/+32 ms
tst400e		 0.361/0.320/0.036	1M incr fcn		 +9/  0/+12 ms

tst901a		 0.532/0.369/0.160	[+] call		+62/-16/+80 ms
tst901b		 1.853/1.580/0.268	[+] call   		+76/-12/+92 ms
======================== 3 May 2007 ===================
Setup on Gio version 5.0-beta2 kernel, 
Athlon 64 3000+ converted to X2 3800
The compilation mode for V5 is --disable-optimize --enable-debug
Without the BATsettrv

command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							
base		 0.175/0.113/0.057					
tst400a		 0.404/0.356/0.047	1M loop			
tst400bHuge	 0.824/0.704/0.084	100k count (io)	
tst400cHuge	 1.026/0.946/0.079	100k insert(io) 
tst400d		 1.568/1.397/0.170	1M insert tuple 
tst400e		 1.415/1.373/0.033	1M incr fcn	

tst901a		 1.636/1.468/0.168	[+] call
tst901b		 3.578/3.354/0.221	[+] call
======================== 3 May 2007 ===================
Setup on Gio version 5.0-beta2 kernel, 
Athlon 64 3000+ converted to X2 3800
The compilation mode for V5 is --disable-optimize --enable-debug
After addition of the BATsettrv

command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							
base		 0.187/0.131/0.052					
tst400a		 0.401/0.357/0.044	1M loop			
tst400bHuge	 0.765/0.683/0.079	100k count (io)	
tst400cHuge	 1.077/0.984/0.084	100k insert(io) 
tst400d		 1.302/1.160/0.137	1M insert tuple 
tst400e		 1.466/1.396/0.050	1M incr fcn	

tst901a		 1.637/1.505/0.162	[+] call
tst901b		 3.949/3.680/0.229	[+] call
======================== 4 May 2007 ===================
Setup on Gio version 5.0-beta2 kernel, 
Athlon 64 3000+ converted to X2 3800
The compilation mode for V5 is --enable-optimize --disable-debug
After addition of the BATsettrv

command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							
base		 0.135/0.090/0.044					
tst400a		 0.221/0.177/0.027	1M loop			
tst400bHuge	 0.421/0.331/0.065	100k count (io)	
tst400cHuge	 0.491/0.408/0.081	100k insert(io) 
tst400d		 0.742/0.558/0.168	1M insert tuple 
tst400e		 0.732/0.684/0.030	1M incr fcn	

tst901a		 0.965/0.741/0.194	[+] call
tst901b		 1.463/1.255/0.158	[+] call
======================== 4 May 2007 ===================
Setup on Gio version 5.0-beta2 kernel, 
Athlon 64 3000+ converted to X2 3800
The compilation mode for V5 is --enable-optimize --disable-debug
After deletion of the BATsettrv

command: time Mserver -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							
base		 0.135/0.090/0.044					
tst400a		 0.221/0.177/0.027	1M loop			
tst400bHuge	 0.440/0.347/0.073	100k count (io)	
tst400cHuge	 0.540/0.440/0.091	100k insert(io) 
tst400d		 0.779/0.558/0.189	1M insert tuple 
tst400e		 0.732/0.681/0.037	1M incr fcn	

tst901a		 0.912/0.710/0.222	[+] call
tst901b		 2.518/2.212/0.265	[+] call
Conclusion: so far no big difference with or without the settrivial properties (BATsettrv)
======================== 28 Aug 2007 ===================
Setup on Gio version 5.0-beta2 kernel, 
Athlon 64 3000+ converted to X2 3800
The compilation mode for V5 is --enable-optimize --disable-debug

command: time mserver5 -c perf.conf TST </dev/null >/dev/null
		gio( load 0)		 
		MonetDB 5							
base		 0.196/0.150/0.045					
tst400a		 0.200/0.154/0.043	1M loop			
tst400bHuge	 0.889/0.666/0.223	100k count (io)	
tst400cHuge	 0.978/0.754/0.223	100k insert(io) 
tst400d		 0.756/0.583/0.173	1M insert tuple 
tst400e		 0.715/0.675/0.039	1M incr fcn	

tst901a		 0.912/0.726/0.186	[+] call
tst901b		 0.875/0.698/0.176	[+] call
Conclusion: a big discrepancy with the 4 May figures (the Huge tests roughly doubled, tst901b dropped from 2.518 to 0.875).
======================== 2 Sep 2007 ===================
Setup on Gio version 5.0-beta2 kernel,
Athlon 64 3000+ converted to X2 3800

The mal_instruction pushArgument contained an error
The compilation mode for V5 is --enable-debug --disable-optimize
base		 0.190/0.144/0.046					
tst400a		 0.419/0.364/0.052	1M loop			
tst400bHuge	 0.755/0.692/0.063	100k count (io)	
tst400cHuge	 1.005/0.972/0.077	100k insert(io) 
tst400d		 1.566/1.364/0.182	1M insert tuple 
tst400e		 1.179/1.127/0.052	1M incr fcn	

tst901a		 1.601/1.428/0.150	[+] call
tst901b		 3.555/3.326/0.225	[+] call
======================== 2 Sep 2007 ===================
Setup on Gio version 5.0-beta2 kernel,
Athlon 64 3000+ converted to X2 3800

TURN ON OPTIMIZATION AGAIN
The compilation mode for V5 is --enable-optimize --disable-debug
base		 0.138/0.101/0.037					
tst400a		 0.212/0.163/0.049	1M loop			
tst400bHuge	 0.390/0.324/0.065	100k count (io)	
tst400cHuge	 0.494/0.422/0.071	100k insert(io) 
tst400d		 0.832/0.674/0.156	1M insert tuple 
tst400e		 0.732/0.687/0.045	1M incr fcn	

tst901a		 0.929/0.746/0.175	[+] call
tst901b		 1.428/1.271/0.157	[+] call
======================== 2 Sep 2007 ===================
Setup on Gio version 5.0-beta2 kernel,
Athlon 64 3000+ converted to X2 3800
After introduction of SLOW/FAST mal_interpreter modes
and no energy saver mode.
The compilation mode for V5 is --enable-optimize --disable-debug
base		 0.075/0.055/0.019					
tst400a		 0.110/0.087/0.022	1M loop			
tst400bHuge	 0.214/0.175/0.037	100k count (io)	
tst400cHuge	 0.xxx/0.xxx/0.xxx	100k insert(io) 
tst400d		 0.398/0.293/0.095	1M insert tuple 
tst400e		 0.361/0.335/0.042	1M incr fcn	

tst901a		 0.479/0.348/0.126	[+] call
tst901b		 1.369/1.202/0.161	[+] call
======================== 5 Oct 2007 ===================
Setup on Gio version 5.0-beta2 kernel,
Athlon 64 3000+ converted to X2 3800
After rolling forward GDK2
The compilation mode for V5 is --enable-optimize --disable-debug
base		 0.069/0.051/0.018					
tst400a		 0.103/0.076/0.024	1M loop			
tst400bHuge	 0.207/0.176/0.030	100k count (io)	
tst400cHuge	 0.277/0.225/0.051	100k insert(io) 
tst400d		 0.347/0.290/0.055	1M insert tuple 
tst400e		 0.373/0.356/0.017	1M incr fcn	

tst901a		 0.416/0.347/0.068	[+] call
tst901b		 1.238/1.135/0.098	[+] call
======================== 1 Nov 2007 ===================
Setup on Gio version 5.0-beta2 kernel,
Athlon 64 3000+ converted to X2 3800
After rolling forward GDK2
The compilation mode for V5 is --enable-optimize --disable-debug
base		 0.077/0.042/0.032					
tst400a		 0.100/0.074/0.027	1M loop			
tst400bHuge	 0.225/0.186/0.039	100k count (io)	
tst400cHuge	 0.301/0.250/0.046	100k insert(io) 
tst400d		 0.347/0.280/0.063	1M insert tuple 
tst400e		 0.357/0.321/0.032	1M incr fcn	

tst901a		 0.437/0.349/0.077	[+] call
tst901b		 1.308/1.178/0.085	[+] call
======================== 21 Feb 2008 ===================
Setup on Gio version 5.0-beta2 kernel,
After new distribution and on FC8.
The compilation mode for V5 is --enable-optimize --disable-debug
command: time mserver5 -c perf.conf TST </dev/null >/dev/null
base		 0.142/0.100/0.042					
tst400a		 0.220/0.177/0.043	1M loop			
tst400bHuge	 0.444/0.377/0.067	100k count (io)	
tst400cHuge	 0.560/0.498/0.064	100k insert(io) 
tst400d		 0.752/0.630/0.121	1M insert tuple 
tst400e		 0.801/0.758/0.043	1M incr fcn	

tst901a		 0.910/0.758/0.153	[+] call
tst901b		 1.408/1.356/0.085	[+] call
======================== 21 Feb 2008 ===================
Setup on Gio version 5.0-beta2 kernel,
After new distribution and on FC8.
After disabling cpuspeed step on the Athlon
The compilation mode for V5 is --enable-optimize --disable-debug
command: time mserver5 -c perf.conf TST </dev/null >/dev/null
base		 0.074/0.050/0.025					
tst400a		 0.110/0.088/0.022	1M loop			
tst400bHuge	 0.234/0.190/0.044	100k count (io)	
tst400cHuge	 0.293/0.243/0.049	100k insert(io) 
tst400d		 0.392/0.327/0.064	1M insert tuple 
tst400e		 0.390/0.366/0.025	1M incr fcn	

tst901a		 0.402/0.274/0.128	[+] call
tst901b		 1.378/1.282/0.096	[+] call
======================== 5 May 2008 ===================
Before new machines arrive
The compilation mode for V5 is --enable-optimize --disable-debug
command: time mserver5 -c perf.conf TST </dev/null >/dev/null
base		 0.147/0.115/0.036					
tst400a		 0.229/0.194/0.035	1M loop			
tst400bHuge	 0.449/0.376/0.073	100k count (io)	
tst400cHuge	 0.602/0.512/0.090	100k insert(io) 
tst400d		 0.799/0.698/0.101	1M insert tuple 
tst400e		 0.460/0.432/0.027	1M incr fcn	

tst901a		 1.080/0.922/0.156	[+] call
tst901b		 2.020/1.841/0.176	[+] call
======================== 21 May 2008 ===================
The new machines arrived, eir.ins.cwi.nl
The compilation mode for V5 is --enable-optimize --disable-debug
command: time mserver5 -c perf.conf TST </dev/null >/dev/null
base		 0.047/0.036/0.009					
tst400a		 0.070/0.053/0.016	1M loop			
tst400bHuge	 0.137/0.116/0.022	100k count (io)	
tst400cHuge	 0.185/0.165/0.021	100k insert(io) 
tst400d		 0.228/0.196/0.031	1M insert tuple 
tst400e		 0.223/0.211/0.011	1M incr fcn	

tst901a		 0.271/0.224/0.047	[+] call
tst901b		 0.737/0.695/0.042	[+] call
======================== 14 Feb 2009 ===================
The compilation mode for V5 is --enable-optimize --disable-debug
command: time mserver5 -c perf.conf TST </dev/null >/dev/null
base		 0.060/0.043/0.014					
tst400a		 0.217/0.198/0.016	1M loop			
tst400bHuge	 0.162/0.129/0.030	100k count (io)	
tst400cHuge	 0.239/0.205/0.029	100k insert(io) 
tst400d		 0.679/0.649/0.027	1M insert tuple 
tst400e		 0.894/0.874/0.018	1M incr fcn	

tst901a		 0.866/0.815/0.047	[+] call
tst901b		 2.601/2.531/0.066	[+] call
======================== 14 Feb 2009 ===================
Make sure the complete power of the cpu is used (cpuspeed)
and reduced some cost in the interpreter
The compilation mode for V5 is --enable-optimize --disable-debug
command: time mserver5 -c perf.conf TST </dev/null >/dev/null
base		 0.058/0.037/0.020					
tst400a		 0.080/0.061/0.016	1M loop			
tst400bHuge	 0.148/0.113/0.034	100k count (io)	
tst400cHuge	 0.217/0.183/0.032	100k insert(io) 
tst400d		 0.320/0.271/0.047	1M insert tuple 
tst400e		 0.227/0.214/0.011	1M incr fcn	

tst901a		 0.368/0.314/0.051	[+] call
tst901b		 0.906/0.856/0.048	[+] call
======================== 12 Sep 2009 ===================
Make sure the complete power of the cpu is used (cpuspeed)
and reduced some cost in the interpreter
The compilation mode for V5 is --enable-optimize --disable-debug
command: time mserver5 -c perf.conf TST </dev/null >/dev/null
base		 0.066/0.043/0.020					
tst400a		 0.089/0.068/0.020	1M loop			
tst400bHuge	 0.157/0.120/0.034	100k count (io)	
tst400cHuge	 0.256/0.226/0.025	100k insert(io) 
tst400d		 0.533/0.511/0.020	1M insert tuple 
tst400e		 0.250/0.229/0.011	1M incr fcn	

tst901a		 0.583/0.539/0.038	[+] call
tst901b		 1.654/1.605/0.033	[+] call
======================== 2 Feb 2011 ===================
Make sure the complete power of the cpu is used (cpuspeed)
and reduced some cost in the interpreter
The compilation mode for V5 is --disable-optimize --enable-debug
command: time mserver5  TST </dev/null >/dev/null
base		0.163/0.089/0.026
tst400a		0.216/0.154/0.023
tst400bHuge	0.367/0.324/0.039
tst400cHuge	0.619/0.536/0.038
tst400d		1.268/1.193/0.039
tst400e		0.867/0.836/0.025

tst901b		4.021/3.973/0.041
======================== 2 Feb 2011 ===================
Make sure the complete power of the cpu is used (cpuspeed)
and reduced some cost in the interpreter
The compilation mode for V5 is --enable-optimize --disable-debug
compilation took 28 minutes
command: time mserver5 TST </dev/null >/dev/null
base		0.115/0.062/0.027
tst400a		0.115/0.082/0.026
tst400bHuge	0.216/0.162/0.043
tst400cHuge	0.369/0.275/0.043
tst400d		0.666/0.624/0.030
tst400e		0.466/0.421/0.031

tst901b		1.918/1.867/0.043
======================== 2 June 2012 ===================
Vienna Fedora 16
The compilation mode for V5 is --enable-optimize --disable-debug
compilation took 7m25 
command: time mserver5 TST </dev/null >/dev/null
base		0.059/0.037/0.015
tst400a		0.110/0.073/0.011
tst400bHuge	0.161/0.117/0.016
tst400cHuge	0.210/0.154/0.018
tst400d		0.459/0.407/0.012
tst400e		0.510/0.468/0.012

tst901a		0.060/0.040/0.014
tst901b		1.462/1.422/0.025
======================== 6 oct 2012 ===================
Vienna Fedora 16
Oct2012 release
The compilation mode for V5 is --enable-optimize --disable-debug
compilation took 5m25 
command: time mserver5 TST </dev/null >/dev/null
base		0.110/0.040/0.012
tst400a		0.160/0.076/0.007
tst400bHuge	0.210/0.119/0.011
tst400cHuge	0.211/0.132/0.015
tst400d		0.463/0.370/0.015
tst400e		0.613/0.524/0.013

tst901a		1.315/0.651/0.595
tst901b		1.313/1.222/0.016
======================== 20 mar 2013 ===================
Vienna Fedora 18
Default release
The compilation mode for V5 is --enable-optimize --disable-debug
compilation took ?
command: time mserver5 TST </dev/null >/dev/null
base		0.212/0.046/0.016
tst400a		0.211/0.079/0.014
tst400bHuge	0.263/0.109/0.020
tst400cHuge	0.264/0.120/0.023
tst400d		0.511/0.319/0.020
tst400e		0.613/0.470/0.012

tst901a		0.655/0.409/0.038
tst901b		1.417/1.100/0.038
======================== 12 oct 2014 ===================
Vienna Fedora 20 
Default release
The compilation mode for V5 is --enable-optimize --disable-debug
compilation took ?
command: time mserver5 TST </dev/null >/dev/null
base		0.111/0.095/0.014
tst400a		0.261/0.190/0.021
tst400bHuge	0.262/0.197/0.027
tst400cHuge	0.262/0.209/0.030
tst400d		0.661/0.601/0.027
tst400e		0.861/0.814/0.016

tst901a		0.812/0.750/0.032
tst901b		2.112/2.051/0.038
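
Note: the rows in this log come from running the shell's "time" on the
server with stdin and stdout redirected to /dev/null. A rough equivalent
in Python, for collecting the real/user/sys triple in a script, could look
like the sketch below. It is Unix-only and time_command() is a made-up
name. How TST is passed to mserver5 is not spelled out in the log, so the
example times the Python interpreter itself as a stand-in command.

```python
import resource
import subprocess
import sys
import time

def time_command(argv):
    """Run argv with stdin/stdout on /dev/null.

    Returns (real, user, sys) in seconds, like the shell's `time`.
    CPU times come from the accumulated usage of waited-for children.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        check=False,
    )
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    sys_time = after.ru_stime - before.ru_stime
    return real, user, sys_time

if __name__ == "__main__":
    # Stand-in command; replace with the real server invocation.
    real, user, sys_time = time_command([sys.executable, "-c", "pass"])
    print(f"{real:.3f}/{user:.3f}/{sys_time:.3f}")
```

As several entries above show, CPU frequency scaling (cpuspeed) can change
the numbers by a factor of two, so it is worth recording that setting next
to each run.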
