From workstation to supercomputer, block by block

DN Staff

January 22, 1997

Mountain View, CA--It's a simple enough idea: Create a computer system that can be pieced together like building blocks. So, instead of making many different machines of all different sizes--workstation, minicomputer, mainframe, supercomputer--you have modules.

Need a workstation? Buy one module. Need more power? Buy more modules, and stack them. And from a manufacturing standpoint, the factory turns out more units of a single product, instead of smaller numbers of different products. That's the theory behind the Silicon Graphics Origin2000, which can go from a deskside cube holding 1 to 8 processors, to racks of such cubes using more than a thousand. "It's something quite a bit different than they've had in the past," says Peter Lowber, an analyst with DataPro. "I think it's very interesting."

The project quickly became known as "Lego" around SGI. "Two of us even independently went out to Toys 'R' Us and got the same Lego kits," Product Designer Jim Ammon recalls with a laugh. One designer even used them to help visualize a 512-processor system.

But while the concept may be basic, engineering implementation was daunting. How do you connect multiple processors so they act together as one, access the same memory, and move data around as fast as one powerful CPU? It's not terribly difficult with two processors; four is also common. But when you start tying together 16, 32, or 64, connectivity problems mount. "Taking modular workstations and building a supercomputer out of them by cabling them together was a new concept," says Product Designer Richard Singer.

As workstation modules mount, cables not only run between neighboring racks, but also from top to bottom diagonally. "It's quite a cabling nightmare," he sighs.

There were electronic and architectural issues to be solved, of course; but also an unusually tough number of mechanical challenges. "It's not always that a computer relies so much on mechanical design," according to Sally Abolitz, product designer. "We got to flex our muscles more than we usually do."

The resulting system features an Origin2000 cube that can hold up to four processing cards, with one or two processors per card. For the modules to stack, their outer "skin" is popped off and they're placed in specially designed racks.

The racks themselves incorporate new CrayLink Interconnect technology, allowing high-speed data flow among the processors, and a distributed directory system for sharing memory. All this makes it possible for cubes tied together to act as if they were a single, high-powered supercomputer, SGI engineers explain.
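The article doesn't spell out how the distributed directory works, but the idea behind directory-based shared memory can be sketched briefly. In the sketch below (a hypothetical illustration, not SGI's actual S2MP protocol; the class and field names are ours), each block of memory has a directory entry recording which nodes hold a cached copy, so a write only needs to notify those nodes rather than broadcasting to the whole machine:

    # Minimal sketch of a distributed-directory entry for shared memory.
    # Assumes a simple presence-bit scheme; purely illustrative.
    class DirectoryEntry:
        """Tracks which nodes currently cache one block of memory."""

        def __init__(self, num_nodes):
            self.sharers = [False] * num_nodes   # one presence bit per node
            self.dirty_owner = None              # node holding a modified copy, if any

        def record_read(self, node_id):
            # A node caches the block for reading; note it as a sharer.
            self.sharers[node_id] = True

        def record_write(self, node_id):
            # Before a node can write, every other cached copy must be
            # invalidated; return the list of nodes to notify.
            to_invalidate = [n for n, cached in enumerate(self.sharers)
                             if cached and n != node_id]
            self.sharers = [False] * len(self.sharers)
            self.sharers[node_id] = True
            self.dirty_owner = node_id
            return to_invalidate

In a distributed directory like this, each node keeps the entries for its own slice of memory, so the coherence bookkeeping scales with the machine instead of funneling through a single bus.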

Connecting conundrum. One thing the engineers quickly discovered when developing the Origin's Scalable Shared-memory MultiProcessing (S2MP) architecture: conventional connectors for the components wouldn't do. "We're used to tolerances of 30 or 40 thousandths of an inch," Abolitz says. "This is three thousandths of an inch."

The design team decided to use CPOP flex-circuit technology from IBM, which relies on tiny dendrites--almost like microscopic Brillo pads. The tiny contacts have low impedance and support high-speed signaling, letting the modular systems grow "in a seamless fashion," Singer explains. "Without this connector, you can't have high enough speeds."

The major issue: how to make sure plug-in components matched perfectly enough so they would properly blind mate. "This was the hot issue," according to Abolitz. "There was a lot of pressure to get that solved."

For one card, with the node's electronics, SGI designers ended up using long screws to pull the card in so it would fit perfectly into place. And for another card, governing data input/output, a cam and hook on the card's underside ensures it is properly guided.

Stack 'em up. Inside the deskside modules themselves, there would be no cables connecting electronic components. "This was a big leap forward for us," Singer says. "We had cables up the ying yang in our previous systems." Instead, everything would be blind mated.

That cut out cabling between most components within each module; but cabling between the modules is considerable. There are up to 16 cables between racks for a 128-processor system; and the cables themselves are heavy and thick. The cables use proprietary technology developed by Gore, with delicate insulation that can be damaged if the cable has less than a 1.25-inch bend radius.

What to do with the mass of cables coming out of every module on a large system? Based on the computer's architectural design, there can be only a 3-meter-long cable between any two points; and the system can theoretically hold up to 1,024 processors (SGI can currently ship systems with 128 processors, eight racks tied together plus a ninth "metarack;" soon 256-processor systems will be available).

Special large cable-guiding systems along the front sides of the rack feature helical moldings that snap in a track that is integral to the rack frame's structural extrusions. The hollow extrusion also serves as a cable conduit. This same extrusion features a recess that captivates large, vacuum-formed side panels.

"The side has no fasteners," Singer notes. "It's like the reverse of a Tupperware top, peeling in or out of recesses.

"All the functionality is in the feature-rich profiles of the extrusions," he explains. An ABS extrusion snaps onto the bottom of the rack, indexing the side panel vertically.

As the design deadline approached, Singer admits, "I was a little nervous about the extrusions. I'd never used them before on a product, and I went wild with them on this one."

The first physical model of a rack's cable-management system was done using cut lengths of PVC drainpipe, with hemp ropes substituting for the masses of cable. The subsequent prototype used a Cubital modeling system and actual CrayLink cables. "That was a day of big surprises," Singer recalls. "The prototypes were brittle, things were cracking left and right. I was gluing my fingers together with super glue." However, while the prototyping session may have been nerve-wracking, the design worked fine in actual production.

For the first time, SGI used a technique called V-Process from Harmony Castings, Harmony, PA, to create large, complex, low-volume parts for the system. Die casting is typically used for this type of part, but time was of the essence for producing the card-guide "cage" to help house printed circuit boards.

V-Process heats a thin plastic film and drapes it over a pattern, which is then surrounded by a flask filled with sand. The sand is vibrated so it packs tightly around the pattern. A second sheet of film goes over the flask, a vacuum draws out the remaining air, and the completed mold is stripped from the pattern. Harmony then pours aluminum into a mold created this way; as long as the mold is held under vacuum, it keeps its shape. After the casting cools, the vacuum is broken, and sand and completed castings fall free.

Each resulting V-Process part costs about five times as much as a conventionally die-cast part, but the tooling costs only one-tenth as much--ideal for low volumes. Most important: Production took only five weeks, compared to an estimated 20 weeks for conventional die casting.

The deskside system's top consists of plastic, while the bottom includes sheet-metal components that mate with a special pallet used to move the system around the manufacturing plant. The pallet stays with the system during final shipment, when it serves as a "sled" to slide the module onto its proper site. Cushion packaging is shaped to double as a ramp, for final sliding off the pallet.

Appearance is also important for a company like Silicon Graphics, known for generating dazzling computer displays and systems for high-profile entertainment-industry users. Engineers say a great deal of attention was paid to industrial design; and it turned out that ID and mechanical needs often fused to create a single solution. "There is a lot of blending whether something is mechanical or cosmetic," Ammon says.

He worked on designing the front door to the module--important to allow easy access to the system while also keeping the stylized look. "Eighty percent of the design time was spent on that one side," he says. The final decision: a door that would slide up and down, not open out, so there'd be no worry about it breaking off. Traditional rods and gears can cause such doors to stick; instead, the door has gear racks on both sides, which engage a gear-and-rod mechanism that forces both sides of the door to slide in unison. A constant-force spring and damper give the motion a constant speed. "The entire door is the actuator," he explains. This prevents the rack-and-bind problems inherent in a door wider than its length.

Structure and heat. SGI designers used Fluent and SDRC analysis packages to model a large multi-fin heat sink. Pro/ENGINEER was tapped for most of the basic CAD work.

To keep the module strong while allowing access to the system from both sides, engineers designed in a "midplane," instead of a conventional backplane, so the cube could open from either side.

Customer demands forced a major revision to the design. A year into the project, the idea of a narrow, 4-processor building-block module was scrapped because it wouldn't fit into standard-sized 19-inch racks, or have enough I/O and flexibility. With half the time gone, engineers were suddenly faced with squeezing twice as many processors into a different-sized box--and no extension to their deadlines.

"Previous design directions were scrapped," Ammon recalls. "Some me-chanical concepts were adapted from the previous chassis design."

Out the door. SGI engineers worked through the summer to finalize the design--it wasn't until August that one of the card connector issues was finally hammered out, just weeks before the system was to ship.

They also worked with a bevy of outside companies to jointly develop various portions of the system. "We worked with all our partners in a collaborative environment, rather than a 'go away, and come back with a design' approach," Ammon notes.

When the Origin finally began rolling out the door: joy, relief, satisfaction. "It's great to work on a project when you have a clean slate," Abolitz says. "It's been fun."


Building a 'hypercube'

The theory behind Silicon Graphics' building-block computing uses multi-dimensional hypercubes. While it's easy enough to visualize a conventional 3-D cube, structures of four or more dimensions are tougher to construct. According to a paper by SGI's Jim Ammon, written to help outside consultants understand the project, a technique he dubbed "double and stretch" helps:

Take a familiar 3-D cube, and duplicate it onto itself with connections between the twin vertices. The twin cubes are pulled away from each other, in the direction of the new 4th dimension, stretching the links between twin vertices. The resulting structure is two 3-D cubes with 8 links connecting the common corners between the cubes.

"If we had a 4-D piece of paper, we could have drawn the 4-D cube without any of the lines crossing," Ammon notes. "We cannot construct a true 4-D structure in our 3-D universe, but we can build a system that maintains the same vertex-to-vertex connectivity.'

The 4-D hypercube includes 16 vertices, 64 processors (4 x 16 vertices) and 32 CrayLink connecting cables. When the 4-D cube is "doubled and stretched" again, the 5-D version features 32 vertices, 128 processors, and 80 CrayLink connections. A 6-D version has 64 vertices, 256 processors, 192 CrayLink cables, and so on. Vertices are actually router chips, to which 4 processors (2 node boards) can be connected.
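Those figures follow a simple pattern: each "double and stretch" doubles the vertex count, so a d-dimensional hypercube ends up with 2^d vertices and d x 2^(d-1) links, with four processors hanging off each router vertex. A short sketch reproduces the numbers above (the function name and layout are ours, purely illustrative):

    # Bookkeeping for the "double and stretch" construction: vertices double
    # with each new dimension, and each doubling adds one new link per
    # pre-existing vertex.
    def hypercube_counts(dim, processors_per_vertex=4):
        """Return (vertices, processors, CrayLink cables) for a dim-D hypercube."""
        vertices = 2 ** dim                   # doubled once per dimension
        links = dim * 2 ** (dim - 1)          # edges of a dim-dimensional hypercube
        return vertices, vertices * processors_per_vertex, links

    for d in (4, 5, 6):
        v, p, c = hypercube_counts(d)
        print(f"{d}-D: {v} vertices, {p} processors, {c} CrayLink cables")
    # Prints 16/64/32, 32/128/80, and 64/256/192 -- matching the counts above.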


Supercomputer design the way it was

Silicon Graphics' modular design is a major leap from the way engineers built the first supercomputers. The following is an excerpt from The Supermen: The Story of Seymour Cray and the Technical Wizards Behind the Supercomputer, by Design News Senior Regional Editor Charles J. Murray. The book, published by John Wiley & Sons, will be available in bookstores in February.

Atop a sandstone bluff, in a three-bedroom cottage overlooking Lake Wissota, Seymour Cray wrestled with the design of the CRAY-2. Although few engineers in Chippewa Falls knew it, Cray had never stopped working on the machine. So while the crew in Boulder toiled away at their version of the CRAY-2, Cray continued to work on his.

In 1981 he spent most of his time at the cottage, trying to piece together the mix of technologies that was whirling around in his brain. To help him, he had set up his own design lab there: In one bedroom there was a big Data General computer, a real monster about six feet high, with several cabinets full of electrical racks.

Cray used the mainframe to run the software programs that aided him in the design of the circuit boards for the CRAY-2. With the software, he could lay out a complex array of electronic components and foil lines on the computer screen. Then, if he wished, he could shift the parts around on-screen without having to rebuild the hardware every time he wanted to make a change.

The mainframe helped Cray to improve the design process, but it was not without its penalties. In 1981, CAD was still in its primitive stages. So Cray needed to augment it with dozens of little personal computers, called SuperBrains, that he also kept inside the cottage. At any one time, Cray usually ran software on about half a dozen SuperBrains and on the mainframe. The SuperBrains were scattered all over the cottage--in the bedrooms, living room, and kitchen. Because personal computers were unreliable in 1981 and were constantly fizzling out, he stored extras in the garage and at the Hallie lab, where a technician was on call to fix the duds.

Cray liked the setup at the cottage. With Cray Research growing larger and more successful, the cottage was a safe haven where he could work in blissful isolation for days on end, away from the inevitable corporate distractions, away from the concerns of a thriving business.

Still, the woods around Lake Wissota were a less-than-ideal locale for designing supercomputers. Power outages were commonplace. Cray never knew when the power might crash, causing him to lose hours' worth of work, so he installed a utility power system in his garage. He also had to install a powerful air conditioner to cool the mainframe.

None of this deterred him in the least. Nor was he deterred when he decided the operating system on the Data General mainframe wasn't right for his needs. He simply sat down and rewrote thousands of lines of operating system code. Ironically he did all of this--the mainframe, SuperBrains, backup power, air conditioning, and new operating system--to save time. Supercomputer design was such a complex process that an upfront investment like this could help in the long run.

For more on this technology, go to www.sgi.com/tech/whitepapers/hypercube.aspl
