Software that does your research for you

DN Staff

September 20, 1999


Internal documents. Magazines. Trade journals. White papers. Conference proceedings. University research papers. Patents. Catalogs. Thousands of web sites.

The engineer's reading list is endless. And the list is multiplying exponentially. With time constraints, deadlines, and quicker time-to-market demands, how does one keep up with evolving technology and research?

The answer: with semantic processing software. Semantics concerns the meaning of words and the relationships among them. Using highly sophisticated algorithms, the software analyzes the network or structure of sentences, seeking out the subjects, actions, and objects of the actions. This is very similar to the dreaded sentence diagramming you likely paid little attention to in seventh-grade English. After this numeric decoding of sentences, semantic processing compiles the information into whatever format the user has designated.
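To make the subject-action-object idea concrete, here is a deliberately simple sketch in Python. It is not CoBrain's algorithm, which the article does not describe; the tiny verb list, the SAO type, and the extract_sao function are all hypothetical, and a real product would rely on a full lexicon and a sentence parser rather than a lookup of known verbs.

```python
import re
from typing import NamedTuple, Optional


class SAO(NamedTuple):
    """One subject-action-object association."""
    subject: str
    action: str
    obj: str


# Hypothetical, tiny action-verb list for illustration only.
ACTIONS = {"stabilizes", "drives", "reduces", "heats"}


def extract_sao(sentence: str) -> Optional[SAO]:
    """Split a simple declarative sentence at the first known action verb."""
    words = re.findall(r"[A-Za-z-]+", sentence)
    for i, word in enumerate(words):
        # The verb needs at least one word on each side of it.
        if word.lower() in ACTIONS and 0 < i < len(words) - 1:
            subject = " ".join(words[:i])
            obj = " ".join(words[i + 1:])
            return SAO(subject, word.lower(), obj)
    return None  # no known action verb found


print(extract_sao("The surfactant stabilizes the emulsion."))
```

Even this toy version shows why the approach differs from keyword matching: the result records who does what to what, not merely which words appear.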

Companies such as IBM are rapidly developing software that would apply this process for researching and compiling all types of computer data, making mounds of information easily accessible via a personal computer.

To date, however, only CoBrain, a software package from Invention Machine (Boston), is commercially available. CoBrain does in minutes what it would take an engineer days to do: research hundreds of documents, read them, search for relationships between concepts, and create a structured knowledge base optimized for sharing within a corporation. "This ability greatly enhances the engineers' productivity in problem solving and new product/process development," says Valery Tsourikov, CEO and chief scientist of Invention Machine. With the software reading, searching, and indexing the information, the engineer has more time for conceptual exploration and testing and can create higher-quality and more innovative concepts. The software also captures intellectual property (and identifies its creators) so that it can be leveraged throughout the enterprise, he continues.

The software's engine or database contains more than one million words. Using these words and their relationship to one another, the software searches for documents that contain a user's request, such as "stabilize emulsions." While any self-respecting keyword search engine, such as Lycos or Yahoo, can do this, CoBrain goes further. The software reads these documents and comes back with folders of information indexed by method. Open a folder and the user immediately sees an explanation of the function as well as a link to the original document.

Developed by mechanical engineers sympathetic to the "keeping up with technology" problem, CoBrain organizes and displays the information in a logical, problem-solution manner, "the way an engineer thinks," says Thomas Murphy, Vice President of Information Manufacturing Corp. (Rocket Center, WV), who has been beta testing the software, along with Unilever, Motorola, Medtronic, and DaimlerChrysler.

"We are testing CoBrain against data we have already captured," says Murphy. "Invention Machine has done a tremendous job with science and engineering information and we are testing its application in other non-science and engineering domains." The program captures the lexicon and vocabulary so it can intelligently capture relationships between ideas.

Murphy asked CoBrain to process four 100-page papers about environmental protection for water resources, written by different authors. Within 40 seconds, CoBrain read and organized the information into 80 action-object-subject associations.

"It would have taken me at least a week to read through and understand the concepts embedded in these papers," says Murphy, who is planning to partner with Invention Machine.

The software presents a structured problem-solving exercise by forcing the user to think through the initial search functionally. A user can either tell the program, "Go read these documents and bring back all main ideas," or create a search stream and look for all documents on, say, motion control.

Having this choice of information is important, says Tsourikov. "You may think that you know what method you want, but once you see all your options, you may decide on something completely different."

One can research and summarize a library of information from the last 30 years in days instead of years, Invention Machine claims. The software can process 1 Mbyte of information per minute. The time is a function of the amount of information one asks CoBrain to research, as well as where it is located. For example, a search of internal documents would not take as long as a search of the U.S. Patent Office for "reducing footprint."
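The quoted rate makes for easy back-of-the-envelope planning. The sketch below, in Python, simply divides collection size by the 1 Mbyte-per-minute figure; the function name and the 500-Mbyte library are hypothetical, and the estimate ignores the time needed to fetch remote documents, which the article notes also affects the total.

```python
MBYTES_PER_MINUTE = 1.0  # processing rate quoted by Invention Machine


def processing_minutes(size_mbytes: float, rate: float = MBYTES_PER_MINUTE) -> float:
    """Estimate processing time in minutes, ignoring retrieval time."""
    if size_mbytes < 0 or rate <= 0:
        raise ValueError("size must be non-negative and rate must be positive")
    return size_mbytes / rate


# Hypothetical library size, chosen only for illustration.
library_mbytes = 500
minutes = processing_minutes(library_mbytes)
print(f"{library_mbytes} Mbytes -> {minutes / 60:.1f} hours")
```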

One constraint of the system is that the information being researched must be electronic. Up until the last decade or so, most companies generated only paper. That's where Information Manufacturing Corp. fits in. Its employees take raw data in any form and transform it into accessible information. Essentially, this means they turn hard copy into an electronic format, processing up to three million pages a month. Murphy describes his company as an information factory. In less than 12 months, the company has grown from 20 to well over 100 employees.

Technology transfer. People talk about knowledge and technology transfer, but there has never been a good way to pass along the information, says Murphy.

Major complex product development and integration corporations use integrated process and product teams to accomplish this. But the team is limited to the knowledge of its members. With semantic processing, a company can quickly and efficiently combine internal knowledge with external knowledge, which is abundantly available on the Web, Murphy continues. "Such a breakthrough is fundamental to how technology and information can and will be distributed among corporations and engineers."

"It's like adding 2,000 of the world's top subject matter experts to your team but not as hard or as expensive," Murphy suggests.

A company can also use CoBrain to search internal documents. "We all tend to reinvent the wheel every time we start with a new procedure or project," says Murphy. "This program says, 'Wait...here's what we have already and how we accomplished it previously.'"

In short, this process extends the box that people tend to work within, says Murphy. "CoBrain researches and processes sources of data one would never read because of lack of time, and suggests methods that you would typically be unfamiliar with."

Most design decisions are made within the first three months of the product development cycle, and these account for almost 70% of the life-cycle costs associated with the product, he continues. Engineers who take advantage of CoBrain's solution-searching and problem-solving capabilities can make smart decisions up front, before committing to a design.

"I've played in the Enterprise Integration and Product Data Management (PDM) areas for years trying to bring information and people together, but have never been able to do it like this technology does," says Murphy.

"It doesn't just bring information to the enterprise, it brings new ideas and solutions."


What this means to you

  • Summarize vast amounts of material quickly

  • Innovation possibilities

  • Quicker time-to-market

  • Controllable access to worldwide patents and research


Step 1

An engineer performs a hypothetical search on a company's intranet for all the documents pertaining to motion control.

Step 2

CoBrain searches the documents, reads them, indexes them, and compiles the information into folders, such as:

  • Motion Control: linear motors

  • Motion Control: actuators

  • Motion Control: ball screws

  • Motion Control: hydraulics

  • Motion Control: pneumatics

Step 3

The engineer is interested in ball screws. He clicks this folder. Under the heading "Ball screws," different types are listed, such as Telescoping, Hollow, Greaseless, and Miniature.

Step 4

The engineer chooses Telescoping.

Step 5

CoBrain shows a description of telescoping ball screws, what they are, a diagram of what they look like, and a hyperlink to the document that originated the information.
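The five steps above amount to drilling down through a nested index: topic and method at the top, type beneath, source document at the bottom. A minimal Python sketch of such an index follows. It assumes the records have already been extracted from the documents; the field names, the build_folders function, and the file paths are hypothetical and do not reflect CoBrain's internal format.

```python
from collections import defaultdict

# Hypothetical records, as if already extracted from intranet documents.
records = [
    {"topic": "Motion Control", "method": "ball screws",
     "variant": "Telescoping", "source": "intranet/doc-a.txt"},
    {"topic": "Motion Control", "method": "ball screws",
     "variant": "Hollow", "source": "intranet/doc-b.txt"},
    {"topic": "Motion Control", "method": "linear motors",
     "variant": "(unspecified)", "source": "intranet/doc-c.txt"},
]


def build_folders(records):
    """Group source documents by 'topic: method' folder, then by variant."""
    folders = defaultdict(lambda: defaultdict(list))
    for record in records:
        folder_name = f'{record["topic"]}: {record["method"]}'
        folders[folder_name][record["variant"]].append(record["source"])
    return folders


folders = build_folders(records)
print(sorted(folders))                                       # Step 2: folder list
print(sorted(folders["Motion Control: ball screws"]))        # Step 3: types listed
print(folders["Motion Control: ball screws"]["Telescoping"])  # Step 5: source link
```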


Semantic Processing: reaching for the Holy Grail

Semantic processing is the Holy Grail of artificial intelligence, says Alan Marwick, manager of text analysis and advanced research at IBM T.J. Watson Research Center (Yorktown Heights, NY). For years, IBM researchers tried unsuccessfully to use computer programming to capture the meaning of written or spoken language by analyzing an entire document or speech. By using a narrower or shallower process, they are making headway. Programs search for semantic "nuggets" or key words and extract these from texts, making logical connections.

For example, a semantic processor analyzing an e-mail about your upcoming office party may pull out four key words: "Office party, Monday evening." While the program has no concept of what these words mean, it knows that this is the important information in the message.
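A shallow extractor of this kind can be imitated with pattern matching. The Python sketch below is only an illustration of the "nugget" idea, not IBM's method; the two patterns, the extract_nuggets function, and the sample e-mail are hypothetical.

```python
import re

# Hypothetical patterns for the kinds of nuggets in the office-party example.
PATTERNS = [
    r"\boffice party\b",
    r"\b(?:mon|tues|wednes|thurs|fri|satur|sun)day (?:morning|afternoon|evening)\b",
]


def extract_nuggets(text: str) -> list:
    """Return every phrase in the text that matches a known pattern."""
    found = []
    for pattern in PATTERNS:
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append(match.group(0))
    return found


email = "Reminder: the office party is Monday evening in the cafeteria."
print(extract_nuggets(email))
```

As in the article's example, the code attaches no meaning to the phrases; it only recognizes that they fit a shape worth extracting.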

A news story with the headline "A Chill wind blows in Texas" gives no indication of the article's subject, Marwick continues. But a semantic processor can summarize the story and tell you that the oil industry in that state is in trouble. It can assign labels to the information and catalog it as well.

"Text classification is better than human beings for filing documents," he says. The computer is more consistent, never tires, and is more accurate.

Internally, IBM developed such a system to help company employees located all over the world deal with the seven million documents on the IBM Intranet. Marwick says they use lexical navigation to build an internal catalog system, the WebCAT, to help people find, sort, and keep track of the information.

In 1998, the company released Intelligent Miner for Text, a tool kit for programmers who want to write software capable of searching text.

The value in this technology is really a double whammy, says Marwick. "We are all in a state of information overload. We are constantly interrupted by incoming e-mails and electronic documents and need more tools to deal with information faster." A semantic processing program could easily search and summarize information, and help people weed through the "junk."

Semantic processing will be the basis of knowledge management programming in the future, says Marwick. In fact, Lotus Notes Release 5 has a version of such a process, which creates a network of connectors between people, organizations, and companies.

Eventually even the present phone system of "Press 1 for company directory," or "Press 2 for sales," will disappear. Semantic processing will enable users to communicate with a computer as they would with an operator, says Marwick.
