DN Staff

March 22, 1999

NT workstations vie for graphic edge

More than 80% of Design News readers use Windows NT-based workstations, and one of the most critical workstation performance criteria is graphics. Vendors are beginning to move beyond simply plugging a graphics card into a 133-Mbyte/sec PCI slot, creating new ways to shatter graphics bottlenecks. Design News invited Compaq, IBM, Intergraph, and Silicon Graphics to describe how their hottest single-processor NT workstations handle graphics and what the future might hold. For a more quantitative approach, visit www.specbench.org for the latest benchmark results.


Compaq Professional Workstation XP1000

By Russ Doty, Technical Marketing Manager, Compaq Computer Corp.

Q: How does the Professional Workstation XP1000 handle graphics?

A: The key to XP1000 graphics is the performance of the Alpha 21264 processor. Executing two billion instructions per second, the Alpha 21264 is one of the fastest processors available for Windows NT. It delivers 1.5x the integer performance of other processors and more than 3.5x the floating-point performance. The Alpha 21264 moves the massive amounts of data encountered in technical applications with an internal bandwidth of 8 Gbytes/sec and a memory bandwidth of 2.6 Gbytes/sec. Other systems have 800 Mbyte/sec of memory bandwidth. The XP1000 further extends performance with a 333-MHz system bus and an advanced crossbar architecture to fully exploit both memory bandwidth and dual independent PCI buses.

Q: What are the advantages of this architecture?

A: The architecture of the 21264 is ideally suited to executing complex graphics algorithms. The combination of floating-point performance and high bandwidth means that the processor can handle the entire geometry pipeline at high speed. The close connection resulting from having both the application and the geometry processing done on the same processor minimizes latency--a key component of performance. These factors, combined with the high-performance 64-bit PCI bus, allow the XP1000 to deliver more graphics primitives to the graphics card.

The PowerStorm 300 graphics card is built around a high-performance graphics chip that can display 4 million triangles/second and 90 million textured pixels/second. The chip also supports advanced 3D graphics modes such as trilinear texturing. The card provides a 15-Mbyte frame buffer, which supports high resolution (1280 x 1024 pixels) running a 24-bit true-color double-buffered display with a full 24-bit Z-buffer and support for OpenGL features such as stencil planes and alpha planes. Smaller frame buffers require tradeoffs between these features, such as requiring lower resolution or demanding that you choose between true color and double buffering.
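The frame-buffer claim can be checked with simple arithmetic. The sketch below is illustrative only: the 8-bit depths for the stencil and alpha planes are assumptions (the article does not state them), and the function name is hypothetical.

```python
# Rough frame-buffer budget for a 1280 x 1024 display mode.
# Assumption: 8-bit stencil and 8-bit alpha planes; the article gives no depths for these.

def framebuffer_bytes(width, height, color_bits=24, buffers=2,
                      z_bits=24, stencil_bits=8, alpha_bits=8):
    """Return the bytes needed to hold every per-pixel plane for one screen."""
    bits_per_pixel = color_bits * buffers + z_bits + stencil_bits + alpha_bits
    return width * height * bits_per_pixel // 8

total = framebuffer_bytes(1280, 1024)
print(f"{total / 2**20:.2f} MiB needed")
print("fits in 15 Mbytes:", total <= 15 * 2**20)
```

Under these assumptions each pixel needs 11 bytes, so the full mode fits within the card's 15 Mbytes, while a card with, say, 8 Mbytes would have to give up resolution or one of the buffers.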

Q: What are your plans for improving graphics performance in the future?

A: The best way to improve graphics performance is with a still-faster processor. Faster Alpha 21264 processors based on new semiconductor manufacturing techniques will soon be available. Upgrades to the graphics subsystem affect pixel fill rates and 2D performance--even with 3D graphics, there are still a lot of 2D operations for windowing and the user interface. The PCI bus is adequate for the 4 million triangles/sec of the PowerStorm 300, but higher geometry rates will need a faster graphics bus, such as AGP.
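A back-of-the-envelope calculation suggests why PCI copes at this triangle rate but leaves little headroom. The sketch below assumes peak transfer rates with no bus overhead, and it assumes the 64-bit PCI bus runs at the standard 33-MHz clock; neither assumption comes from the article.

```python
# Bytes the bus can deliver per triangle at a given geometry rate.
# Assumptions: theoretical peak bandwidth, no protocol overhead, 33.33-MHz PCI clock.

def bytes_per_triangle(bus_bytes_per_sec, triangles_per_sec):
    return bus_bytes_per_sec / triangles_per_sec

PCI_32 = 33.33e6 * 4  # 32-bit PCI, about 133 Mbytes/sec
PCI_64 = 33.33e6 * 8  # 64-bit PCI, about 267 Mbytes/sec

for name, bandwidth in (("32-bit PCI", PCI_32), ("64-bit PCI", PCI_64)):
    budget = bytes_per_triangle(bandwidth, 4e6)
    print(f"{name}: {budget:.1f} bytes per triangle at 4 million triangles/sec")
```

Doubling the triangle rate halves each budget, which is consistent with the point that higher geometry rates call for a faster graphics bus.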

The host processor of the XP1000 runs the application and performs geometry calculations. The PowerStorm 300 graphics card performs pixel-level operations and display.


IBM IntelliStation

By Rich Anderson, IntelliStation Performance Engineer, IBM

Q: How does the IntelliStation handle graphics?

A: Our architectures treat the graphics task in an assembly-line fashion. At each stage specific tasks occur, and the results are passed to the next stage.

There are three main stages. At the first stage, the application manages its model data. At the second (or geometry) stage, the triangles that approximate the model are moved and lit. At the final (or raster) stage, this data is compressed into a 2D picture that appears on your computer screen.

We have two 3D graphics architectures, an advanced solution called the IBM Fire GL1 and a maximum-performance solution called the Intense3D Wildcat 4000. Both architectures use the system processor and memory to handle stage one. Both architectures also use a custom chip and separate memory to handle stage three. The key difference is in stage two, the geometry stage. The IBM Fire GL1 uses the system processor to do the geometry processing. If the user has a second processor, the lighting task is separated and run on the second CPU. The Wildcat card has a special-purpose chip to perform the geometry stage. It was designed for this task, so it performs far faster than the system CPU.
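The three stages can be sketched in a few lines of code. This is a minimal illustration, not IBM's implementation: the function names, the rotation about the Y axis, the simple diffuse lighting model, and the projection constants are all assumptions, and the raster stage here only projects vertices to screen coordinates rather than filling pixels.

```python
import math

def rotate_y(v, angle):
    """Rotate a 3D point about the Y axis."""
    x, y, z = v
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    # Assumes a non-zero vector (no degenerate triangles).
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def geometry_stage(triangles, angle, light_dir):
    """Stage two: move each triangle, then light it from its surface normal."""
    light = normalize(light_dir)
    lit = []
    for tri in triangles:
        moved = [rotate_y(v, angle) for v in tri]
        normal = normalize(cross(sub(moved[1], moved[0]), sub(moved[2], moved[0])))
        intensity = max(0.0, sum(n * l for n, l in zip(normal, light)))
        lit.append((moved, intensity))
    return lit

def raster_stage(lit_triangles, width=640, height=480, focal=100.0):
    """Stage three (simplified): perspective-project vertices onto a 2D screen."""
    picture = []
    for verts, intensity in lit_triangles:
        points = [(width / 2 + focal * x / z, height / 2 - focal * y / z)
                  for x, y, z in verts]
        picture.append((points, intensity))
    return picture

# Stage one: the application owns the model data (a single triangle here).
model = [((0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.0, 1.0, 5.0))]
picture = raster_stage(geometry_stage(model, angle=0.0, light_dir=(0.0, 0.0, 1.0)))
print(picture)
```

In these terms, the Fire GL1 would run `geometry_stage` on the system processor, while the Wildcat 4000 moves that work onto a dedicated chip.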

Q: What are the advantages of this architecture?

A: The key advantage is the amount of data the workstation can handle due to the way we dedicate bandwidth to the subsystems that need it most. The assembly-line approach allows each computer chip to access memory without interfering with other chips. This is important because current technology limits the number of instructions a single chip can hold, and the amount of memory that a single chip can access.

Our current workstation graphics can communicate with the main system memory at over 500 Mbytes/sec. Then each individual graphics stage calculates and generates images and data at much higher rates. For instance, the IBM Fire GL1 has 3.2 Gbytes/sec of bandwidth, and the Intense3D Wildcat 4000 has 4.5 Gbytes/sec of bandwidth for stage three.

Q: How do you plan to improve graphics performance in the future?

A: This year we plan to push two strategies: scale our graphics solutions, and lower the per-seat cost of advanced 3D support. Scaling our graphics lets designers increase their model size and geometry complexity. These improvements will allow models to move more smoothly and appear more realistic. We will also be the first to deliver a single 256-bit graphics chip with full OpenGL 1.2 acceleration, thereby reducing the cost of 3D workstations significantly.

For the IBM IntelliStation, the system processor and memory manage model data, and a custom chip and separate memory perform rasterization. The AGP 2X graphics box represents either the IBM Fire GL1 card, which uses the system processor to do geometry processing, or the Intense3D Wildcat 4000 card, which has a special-purpose chip to do this task.


A brief history of NT workstation graphics

By Peter ffolkes, Industry Analyst, Dataquest, San Jose, CA

Up until last year, most of the graphics in Intel NT workstations were plugged into a PCI slot. And that was reasonably fine. PCI has a theoretical maximum bandwidth of 133 Mbytes/sec, and that was fast enough for the speed at which Intel-based systems could drive graphics. The only limitation was that you needed to have the physical memory for such tasks as texture mapping on the graphics card. That requirement added expense. Some vendors did have more than one PCI bus on their platforms, so you could dedicate one PCI bus to graphics and have disk I/O on another one.

On the NT side, the standard Intel architecture is based on the Pentium II or Xeon with either the 440BX or 440GX chipset. Both support PCI and Intel's dedicated graphics bus called AGP. What you're getting on AGP 1X is twice the bandwidth of PCI--266 Mbytes/sec. But nobody used that because the first implementation that came out supported the 2X facility, which took the speed up to 533 Mbytes/sec. AGP 2X allowed designs that would enable a workstation's main memory to be used for texture, which worked reasonably well for the gaming market because it kept costs down. However, it was not supported in Windows NT--just Windows 95 and 98.

The next generation of systems--even at 533 Mbytes/sec--didn't really deliver much of an appreciable benefit to the workstation community because PCI was still fast enough to handle things. As we move forward, most vendors making graphics cards for NT workstations have adopted AGP because it's fast and it's a dedicated graphics slot. But these vendors still need dedicated texture memory on the card because only when NT 5 comes along will we get the memory support to be able to use main memory for texture and other advanced graphics tasks.

During the coming year, expect to see Intel take AGP up to 4X, which should be about a gigabyte per second of bandwidth dedicated for graphics.
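The bandwidth figures quoted above follow from bus clock, bus width, and transfers per clock. As a quick check, assuming the standard 33.33-MHz PCI clock, the 66.67-MHz AGP clock, and 32-bit data paths (the helper name is hypothetical):

```python
# Peak bus bandwidth = clock x transfers per clock x width in bytes.
# These are theoretical maxima; sustained rates are lower.

def peak_mbytes_per_sec(clock_mhz, width_bits, transfers_per_clock=1):
    return clock_mhz * transfers_per_clock * width_bits / 8

PCI_CLOCK_MHZ = 100 / 3  # 33.33 MHz
AGP_CLOCK_MHZ = 200 / 3  # 66.67 MHz

buses = {
    "PCI": peak_mbytes_per_sec(PCI_CLOCK_MHZ, 32),
    "AGP 1X": peak_mbytes_per_sec(AGP_CLOCK_MHZ, 32, 1),
    "AGP 2X": peak_mbytes_per_sec(AGP_CLOCK_MHZ, 32, 2),
    "AGP 4X": peak_mbytes_per_sec(AGP_CLOCK_MHZ, 32, 4),
}

for name, rate in buses.items():
    print(f"{name}: {rate:.0f} Mbytes/sec")
```

AGP 4X comes out at roughly 1,067 Mbytes/sec, which matches the "about a gigabyte per second" estimate.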


NT WORKSTATION GRAPHICS COMPARISON

Company: Compaq, IBM, Intergraph, Silicon Graphics

Note: Product information from Dell and Hewlett-Packard was unavailable at press time.


Intergraph TDZ 2000 ViZual Workstation

By Clive Maxfield, Member of the Technical Staff, Intergraph Computer Systems

Q: How does Intergraph's TDZ 2000 handle graphics?

A: Intergraph's TDZ 2000 workstation family supports both industry-standard AGP- and PCI-based graphics accelerator cards (2D and 3D), allowing customers to select the optimum card for their application requirements and budget.

With regard to graphics subsystems, the highest-performing 3D graphics available on NT today are found in Intergraph's Intense3D Wildcat 4000, according to some industry benchmarks. Unlike most graphics systems, the Wildcat 4000 geometry accelerator (which has more transistors than a Pentium II and can perform 3 billion floating-point operations a second) is driven from the AGP bus. This type of acceleration allows multiple rasterization/texture processing cards to be connected to the PCI bus for multiple-screen support.

TDZ 2000s with Intense3D Wildcat graphics also use Intergraph's DirectBurst technology, which allows the CPU to burst data directly to the geometry accelerator without tying up main memory. Similarly, each Intense3D Wildcat 4000 rasterization card has 64 Mbytes of high-speed texture memory and dedicated fast and wide texture buses, allowing simultaneous accesses to the texture RAM and frame buffer.

Q: What are the advantages of this architecture?

A: With some system architectures, all of the data (application, graphics, and I/O) passes through main memory, thereby impacting total system performance. With the TDZ 2000, a high-speed dedicated frame buffer and texture memory offload the main system, leaving it free to perform other tasks. Even though it might appear that the single main-memory architecture would provide larger internal bandwidth, shared access to a single blob of memory is in fact performance limiting. Multiple pieces of memory strategically placed in a workstation architecture provide unrestricted access to memory, thus maximizing performance. For example, the power performers of the TDZ 2000 family can provide a total system bandwidth of 6.0 Gbytes/sec.

Q: What are your plans for improving graphics performance in the future?

A: We will continue to design next-generation systems using the latest and fastest processors, memory devices, and peripherals. We also continue to invest in driver technology, as well as in our relationships with software vendors and peripheral and BIOS manufacturers.



SGI Visual Workstation

By Zahid Hussain, Chief Engineer, Workstation Div., Silicon Graphics Inc.

Q: How does SGI's new Visual Workstation handle graphics?

A: The Silicon Graphics Visual Workstations use the Cobalt graphics engine to deliver 2D and 3D graphics performance. Cobalt is optimized for standard APIs including OpenGL, OpenGL Optimizer, Direct Draw, and GDI. Cobalt has advanced OpenGL extensions that are available to software partners so that they can take advantage of the latest graphics and media features from SGI.

Cobalt consists of a geometry pipeline that performs transforms, lighting, and line and triangle rasterization--coupled with a rasterization pipeline that performs 3D pixel rasterization operations such as shading, texturing, z-buffering, and stenciling. Cobalt enables dynamic allocation and assignment of graphics memory (including frame buffer, z buffer, overlay, and texture) in the system memory. This capability provides the system with sufficient memory to handle the operating-system requirements, as well as the graphics requirements like the frame buffer to support resolutions as high as 1,920 x 1,200 pixels at 32-bit color depth, and dedicated texture memory of up to 90% of the system memory capacity.

Q: What are the advantages of this architecture?

A: The Visual Workstations are built on an Integrated Visual Computing (IVC) architecture that supports a native implementation of Microsoft Windows NT. The highly tuned IVC architecture leverages dual-capable Intel Pentium II processors (Silicon Graphics 320) or quad-capable Pentium II Xeon processors (Silicon Graphics 540) for the industry's most scalable graphics workstation. High-speed interconnects include a 3.2-Gbyte/sec memory bus that is six times the bandwidth of AGP 2X, and a 1.6-Gbyte/sec bidirectional I/O bus that is 12 times the bandwidth of a traditional 32-bit PCI bus. The system also offers two independent 64-bit PCI buses for increased bandwidth and reduced bus contention to support high-speed storage devices, multiple networking protocols, and other PCI-based peripherals.

Because of their integrated hardware-accelerated Cobalt graphics, fast networking, built-in audio and video, and high-speed, high-bandwidth bus infrastructure, the workstations can efficiently handle large amounts of data. The Visual Workstation system particularly suits applications that require the creation, processing, and manipulation of complex, real-time visual data. Tight integration of system components lowers overall cost and provides both the bandwidth and the processing power to support extremely large 2D images, 3D models, visual databases of geometry and textures, and multiple streams of uncompressed digital video I/O.
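The interconnect ratios quoted in this answer can be checked against the bus figures used elsewhere in the article. The sketch below assumes decimal units (1 Gbyte = 1,000 Mbytes) and compares the memory bus against both the 533-Mbyte/sec and the 512-Mbyte/sec figures that appear for AGP 2X; the variable names are illustrative.

```python
# Ratios behind the "six times AGP 2X" and "12 times PCI" claims.
# Assumption: decimal units, so 3.2 Gbytes/sec = 3,200 Mbytes/sec.

MEMORY_BUS = 3200  # Mbytes/sec
IO_BUS = 1600      # Mbytes/sec
AGP_2X_533 = 533   # Mbytes/sec, the peak figure quoted for AGP 2X
AGP_2X_512 = 512   # Mbytes/sec, the rounded figure also quoted for AGP 2X
PCI_32 = 133       # Mbytes/sec, 32-bit PCI peak

print(f"memory bus vs AGP 2X (533): {MEMORY_BUS / AGP_2X_533:.2f}x")
print(f"memory bus vs AGP 2X (512): {MEMORY_BUS / AGP_2X_512:.2f}x")
print(f"I/O bus vs 32-bit PCI:      {IO_BUS / PCI_32:.2f}x")
```

Either AGP 2X figure gives a ratio of about six, and the I/O bus works out to about 12 times 32-bit PCI.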

Q: What are your plans for improving graphics performance in the future?

A: By leveraging the advancements of the Intel Pentium processor family, the Silicon Graphics Visual Workstation will achieve improvements in graphics performance as measured by Glperf, Viewperf, and applications benchmarks. This accomplishment is in contrast to accelerated AGP cards with geometry accelerators that are CPU independent. The Cobalt graphics engine is designed to support future processors and front-side bus architectures from Intel, thereby providing for even greater bandwidth and performance. In addition, future feature enhancements to Cobalt will deliver a richer visual experience.

SGI's Unified Memory Architecture embeds graphics acceleration into the ASIC chipset for its Visual Workstation line. The chipset connects with main memory via a 3.2-Gbyte/sec bus, which has six times the bandwidth of the 512-Mbyte/sec AGP 2X bus.
