PKS 5.0: puzzled by 32-bit vs 64-bit performance

 
anon_poster
Posted: Sun Nov 07, 2004 11:24 pm    Post subject: PKS 5.0: puzzled by 32-bit vs 64-bit performance

I got caught up in AMD's "64-bit hype" machine, so I went
out and bought a cheap Athlon64 system, set it up with
a whiteboxlinux.org (Redhat RHEL 3 clone) installation,
then ran some simplistic synthesis tests.

The results really confounded me:

(A) = Athlon64 2800+ (1.8GHz Socket754), 1.0GB PC2700 CL2.5 DDR,
(motherboard is an ECS 755-A2 v1.0)
WhiteboxLinux 3.0 Respin1 x86_64 (similar to RHEL 3.0 update 2)

(B) = dual Pentium3/S 1.26GHz, 4.0GB PC133 CL3 (reg,ECC) SDRAM
(motherboard is a Supermicro P3TDDE)
Redhat 8.0 linux, plus *all* released Redhat RPM updates

(C) = Sun Blade-1000 2750 (dual USparc3 750MHz, 8MB cache), 8.0GB RAM
Solaris 8, base installation (no updates)


Machine  Software           RAM-usage  runtime
-------  -----------------  ---------  -------
(A)      x86 PKS5 32-bit    ~300MB     29min
(A)      x86 PKS5 64-bit    ~400MB     22min
(B)      x86 PKS5 32-bit    ~300MB     50min
(C)      SunOS PKS5 32-bit  ~300MB     70min
(C)      SunOS PKS5 64-bit  ~400MB     80min


Software = Cadence SPR50 (October 2004 update)

"Test-case" is a simple (<80Kgates) Verilog-HDL compile, from RTL all
the way to placed-gates + clock-tree synthesis (no routing, no DFT, no
lower-power stuff.)

Here's the shocker...while the Solaris 64-bit ran *SLOWER* than its
32-bit version -- the x86_64 platform did the exact opposite.
The x86_64 binary ran *FASTER* than the x86 32-bit binary.
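
To put numbers on that comparison, the runtimes can be turned into ratios. The following is a minimal Python sketch: the `runs` dictionary and the helper names `speedup` and `ram_growth` are chosen here for illustration, and the figures are simply the approximate RAM and runtime values from the results table.

```python
# Runtimes (minutes) and approximate RAM use (MB) copied from the results table.
runs = {
    ("A", 32): {"ram_mb": 300, "minutes": 29},
    ("A", 64): {"ram_mb": 400, "minutes": 22},
    ("B", 32): {"ram_mb": 300, "minutes": 50},
    ("C", 32): {"ram_mb": 300, "minutes": 70},
    ("C", 64): {"ram_mb": 400, "minutes": 80},
}

def speedup(machine):
    """32-bit runtime divided by 64-bit runtime (>1 means 64-bit is faster)."""
    return runs[(machine, 32)]["minutes"] / runs[(machine, 64)]["minutes"]

def ram_growth(machine):
    """64-bit RAM use divided by 32-bit RAM use."""
    return runs[(machine, 64)]["ram_mb"] / runs[(machine, 32)]["ram_mb"]

# Only machines (A) and (C) were run in both modes.
for m in ("A", "C"):
    print(f"({m}) 64-bit speedup: {speedup(m):.2f}x, RAM growth: {ram_growth(m):.2f}x")
```

With these inputs the Athlon64 comes out roughly a third faster in 64-bit mode, while the Sun is roughly an eighth slower, and both show about the same growth in RAM use.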

....

What I want to know is ...

(a) Did I set up something incorrectly? (Why does 64-bit on the
Linux platform run faster? Yet 64-bit on Solaris is slower?)

(b) I didn't have a chance to try Intel's 64-bit IA32e Xeon.
Does IA32e experience the same trend (i.e. 64-bit is faster)?

(c) Why is the netlist-output different among all 4 platforms?!?
I think I used identical setup-scripts for all 4 runs, but
even among the same platform (Sun, x86), the 32-bit vs 64-bit
QoR/area results differ. I guess this goes back to (a)

(d) Is my testcase consistent with other people's experiences?
For example, does Synopsys's Design_Compiler follow the
same trend?

(e) When will all EDA vendors port *EVERYTHING* to x86_64? :)

Diva Physical Verification
Posted: Mon Nov 08, 2004 1:45 am    Post subject: Re: PKS 5.0: puzzled by 32-bit vs 64-bit performance

I cannot speak to the Linux questions you ask, but Solaris is something
I know a few things about. The 64bit Solaris run is using the same size
processor cache as the 32bit run. When an application uses many
pointers, there are fewer pointers stored in the cache in 64bit mode
since the pointers are twice as large. This results in more cache
misses, which cause the processor to have to go to RAM more often,
resulting in more runtime.
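
The cache argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes a hypothetical node holding two pointers and a 4-byte payload, and uses the 8MB cache size listed for the Blade-1000. The real PKS data structures are unknown, so the numbers only illustrate the direction of the effect.

```python
CACHE_BYTES = 8 * 1024 * 1024  # 8MB cache, as listed for machine (C)

def node_size(pointer_bytes, n_pointers=2, payload_bytes=4):
    """Size of a hypothetical node: pointers plus payload, padded to pointer alignment."""
    raw = n_pointers * pointer_bytes + payload_bytes
    # Round up to a multiple of the pointer size, as a typical ABI would pad a struct.
    return -(-raw // pointer_bytes) * pointer_bytes

def nodes_in_cache(pointer_bytes):
    """How many such nodes fit in the cache at once."""
    return CACHE_BYTES // node_size(pointer_bytes)

for bits, ptr in ((32, 4), (64, 8)):
    print(f"{bits}-bit: node = {node_size(ptr)} bytes, "
          f"{nodes_in_cache(ptr)} nodes fit in cache")
```

For this assumed node, doubling the pointer size doubles the node size, so half as many nodes fit in the same cache. The reported RAM growth (~300MB to ~400MB) is smaller than that, which would be consistent with only part of the tool's data being pointers.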

One might think the AMD and Intel 64bit processors would have the same
issue, but processor designers do some very odd things at times. Maybe
someone at AMD anticipated this problem and did something ingenious. If
this is so, I hope AMD recognized that genius with a wad of cash.

As for why AMD runs faster in 64bit mode, we can only speculate. It may
be as simple as 32bit needing an extra step to convert addresses to
64bit since the hardware is probably all 64bit pointers internally.
Looking at the machine code might be very illuminating.

Kim Enkovaara
Posted: Mon Nov 08, 2004 11:31 am    Post subject: Re: PKS 5.0: puzzled by 32-bit vs 64-bit performance

Diva Physical Verification wrote:

Quote:
As for why AMD runs faster in 64bit mode, we can only speculate. It may
be as simple as 32bit needing an extra step to convert addresses to
64bit since the hardware is probably all 64bit pointers internally.
Looking at the machine code might be very illuminating.

AMD also improved the x86 architecture at the same time. x86 originally
had very few registers. x86_64 has more registers for example, and
that can lead to better code generation. There are also other small
improvements.

--Kim

gennari
Posted: Sat Nov 13, 2004 3:35 am    Post subject: Re: PKS 5.0: puzzled by 32-bit vs 64-bit performance

"Kim Enkovaara" <kim.enkovaara@tellabs.com> wrote in message
news:YoEjd.21$ep7.17@reader1.news.jippii.net...
Quote:
Diva Physical Verification wrote:

As for why AMD runs faster in 64bit mode, we can only speculate. It may
be as simple as 32bit needing an extra step to convert addresses to
64bit since the hardware is probably all 64bit pointers internally.
Looking at the machine code might be very illuminating.

AMD also improved the x86 architecture at the same time. x86 originally
had very few registers. x86_64 has more registers for example, and
that can lead to better code generation. There are also other small
improvements.

--Kim

Yes, in general you can't compare a 32-bit system with a 64-bit system based
on the number of bits alone. There are many differences other than the bit
width that affect performance: the architecture, memory access/bandwidth
(64-bit systems may have higher bandwidth), cache size, etc. Also, if you
compile a binary on a 32/64-bit system the compiler may or may not optimize
for that particular bit width. If you're running a 32-bit OS or binary on a
64-bit system, the 32-bit emulation might slow down software execution.

Different simulation results might be due to several factors:
(1) The precision of 32-bit vs. 64-bit data values (though the floating-point
    numbers are probably all standard IEEE 64-bit)
(2) Different OS/compiler math libraries handling math exceptions such as
    divide-by-zero differently
(3) Uninitialized memory (an error in the software)
(4) "Non-stable" functions such as sorts (or even the ordering of pointer
    values) may have been implemented differently in the libraries on the
    various systems

Still, if the results differ significantly it's probably due to either user
error or software error.
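
The point about non-stable sorts and pointer ordering can be illustrated with a toy heuristic. This is a sketch under assumptions: `greedy_pack` and the cell data are hypothetical and have nothing to do with PKS internals. It only shows that a greedy algorithm can give a different result when equally ranked items arrive in a different order, which is what a different sort implementation or different pointer values could cause.

```python
def greedy_pack(cells, capacity):
    """Place cells in the given order, skipping any that no longer fit."""
    placed, used = [], 0
    for name, size in cells:
        if used + size <= capacity:
            placed.append(name)
            used += size
    return placed, used

# Three hypothetical cells (name, size) that the tool considers equally
# important, so their relative order is not fixed by the algorithm itself.
cells = [("u1", 6), ("u2", 5), ("u3", 4)]

# Stand-ins for two platforms that happen to break the tie differently.
order_a = sorted(cells, key=lambda c: c[0])                # u1, u2, u3
order_b = sorted(cells, key=lambda c: c[0], reverse=True)  # u3, u2, u1

print("platform A:", greedy_pack(order_a, capacity=10))
print("platform B:", greedy_pack(order_b, capacity=10))
```

Both orders are equally valid inputs, yet the heuristic fills the row completely in one case and leaves a gap in the other. Small differences like this early in a flow can compound into visibly different netlists and area numbers.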

Frank
 
All times are GMT