Software that does your research for you

September 20, 1999

Internal documents. Magazines. Trade journals. White papers. Conference
proceedings. University research papers. Patents. Catalogs. Thousands of web
sites.

The engineer's reading list is endless. And the list is multiplying
exponentially. With time constraints, deadlines, and quicker time-to-market
demands, how does one keep up with evolving technology and research?

The answer: with semantic processing software. Semantics concerns the
relationships among words. Using highly sophisticated algorithms, the software
analyzes the structure of sentences, seeking out the subjects, actions, and
objects of the actions. This is very similar to the dreaded sentence diagramming
you likely paid little attention to in seventh-grade English. After this numeric
decoding of sentences, semantic processing compiles the information into
whatever format the user has designated.
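
As a rough illustration of the subject-action-object idea, here is a minimal Python sketch. It is not how any commercial product works: it assumes simple, active-voice sentences, and the verb list and the sample sentences are hypothetical stand-ins for a real lexicon and real documents.

```python
# Toy subject-action-object (SAO) extraction.
# Assumption: each sentence is a simple "subject verb object" clause, and
# ACTION_VERBS is a hypothetical stand-in for a real lexicon.
ACTION_VERBS = {"stabilizes", "reduces", "drives", "heats", "cools"}

def extract_sao(sentence):
    """Return (subject, action, object) for the first known verb, else None."""
    words = sentence.strip().rstrip(".").split()
    for i, word in enumerate(words):
        # The verb must have at least one word on each side of it.
        if word.lower() in ACTION_VERBS and 0 < i < len(words) - 1:
            subject = " ".join(words[:i])
            obj = " ".join(words[i + 1:])
            return (subject, word.lower(), obj)
    return None

if __name__ == "__main__":
    for s in ["Lecithin stabilizes oil-in-water emulsions.",
              "A ball screw drives the linear stage."]:
        print(extract_sao(s))
```

A real system would need a parser that copes with passive voice, clauses, and synonyms; the sketch only shows the shape of the output, one triple per sentence.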

Companies such as IBM are rapidly developing software that would apply this
process for researching and compiling all types of computer data, making mounds
of information easily accessible via a personal computer.

To date, however, only CoBrain, a software package from Invention Machine
(Boston), is commercially available. CoBrain does in minutes what it would take
an engineer days to do: research hundreds of documents, read them, search for
relationships between concepts, and create a structured knowledge-base optimized
for sharing within a corporation. "This ability greatly enhances the engineers'
productivity in problem solving and new product/process development," says
Valery Tsourikov, CEO and chief scientist of Invention Machine. With the
software reading, searching, and indexing the information for the engineer, he
or she has more time for conceptual exploration and testing and can create
higher-quality and more innovative concepts. The software also helps capture
intellectual property (and IP creators) for leveraging throughout the
enterprise, he continues.

The software's engine or database contains more than one million words. Using
these words and their relationship to one another, the software searches for
documents that contain a user's request, such as "stabilize emulsions." While
any self-respecting keyword search engine, such as Lycos or Yahoo, can do this,
CoBrain goes further. The software reads these documents and comes back with
folders of information indexed by method. Open a folder and the user immediately
sees an explanation of the function as well as a link to the original document.

Developed by mechanical engineers sympathetic to the "keeping up with
technology" problem, CoBrain organizes and displays the information in a
logical, problem-solution manner, "the way an engineer thinks," says Thomas
Murphy, Vice President of Information Manufacturing Corp. (Rocket Center, WV),
who has been beta testing the software, along with Unilever, Motorola,
Medtronic, and DaimlerChrysler.

"We are testing CoBrain against data we have already captured," says Murphy.
"Invention Machine has done a tremendous job with science and engineering
information and we are testing its application in other non-science and
engineering domains." The program captures the lexicon and vocabulary so it can
intelligently capture relationships between ideas.

Murphy asked CoBrain to process four 100-page papers about environmental
protection for water resources, written by different authors. Within 40 seconds,
CoBrain read and organized the information into 80 action-object-subject
associations.

"It would have taken me at least a week to read through and understand the
concepts embedded in these papers," says Murphy, who is planning to partner with
Invention Machine.

The software presents a structured problem-solving exercise by forcing the
user to think through the initial search functionally. A user can either tell
the program, "Go read these documents and bring back all main ideas," or
create a search stream and look for all documents on, say, motion control.

Having this choice of information is important, says Tsourikov. "You may
think that you know what method you want, but once you see all your options, you
may decide on something completely different."

One can research and summarize a library of information from the last 30
years in days instead of years, Invention Machine claims. The software can
process 1 Mbyte of information per minute. The time is a function of the amount
of information one asks CoBrain to research, as well as where it is located. For
example, searching internal documents would not take as long as searching the
U.S. Patent Office for "reducing footprint."
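
The quoted rate makes for easy back-of-the-envelope estimates. The Python sketch below assumes the 1 Mbyte per minute figure holds as a constant rate and ignores the network delays mentioned above; the function names are made up for illustration.

```python
# Rough timing at the quoted rate of 1 Mbyte per minute.
# Assumption: the rate is constant and retrieval time is ignored.
RATE_MB_PER_MIN = 1.0

def minutes_to_process(megabytes, rate=RATE_MB_PER_MIN):
    """Minutes needed to process the given volume of text."""
    return megabytes / rate

def megabytes_in(seconds, rate=RATE_MB_PER_MIN):
    """Volume of text processed in the given number of seconds."""
    return rate * seconds / 60.0

if __name__ == "__main__":
    print(minutes_to_process(30))   # a 30-Mbyte archive
    print(megabytes_in(40))         # volume covered in 40 seconds
```

At this rate, 40 seconds covers about two-thirds of a megabyte. Spread over the 400 pages in Murphy's test, that works out to roughly 1.7 kbytes of text per page, so the two figures are at least plausibly consistent for plain text. The article does not give the actual file sizes.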

One constraint of the system is that the information being researched must be
electronic. Up until the last decade or so, most companies generated only paper.
That's where Information Manufacturing Corp. fits in. Its employees take raw
data in any form and transform it to accessible information. Essentially, this
means they turn hard copy into an electronic format, processing up to three
million pages a month. Murphy describes his company as an information factory.
In less than 12 months, the company has grown from 20 to well over 100
employees.

Technology transfer. People talk about knowledge and technology transfer, but
there has never been a good way to pass along the information, says Murphy.

Major complex product development and integration corporations use integrated
process and product teams to accomplish this. But the team is limited to the
knowledge of its members. With semantic processing, a company can quickly and
efficiently combine internal knowledge with external, which is abundantly
available with the power of the Web, Murphy continues. "Such a breakthrough is
fundamental to how technology and information can and will be distributed among
corporations and engineers."

"It's like adding 2,000 of the world's top subject matter experts to your
team but not as hard or as expensive," Murphy suggests.

A company can also use CoBrain to search internal documents. "We all tend to
reinvent the wheel every time we start with a new procedure or project," says
Murphy. "This program says, 'Wait...here's what we have already and how we
accomplished it previously.'"

In short, this process extends the box that people tend to work within, says
Murphy. "CoBrain researches and processes sources of data one would never read
because of lack of time, and suggests methods that you would typically be
unfamiliar with."

Most design decisions are made within the first three months of the product
development cycle, and these account for almost 70% of the life-cycle costs
associated with the product, he continues. Engineers who take advantage of
CoBrain's solution-searching and problem-solving capabilities can make smart
decisions up front, before committing to a design.

"I've played in the Enterprise Integration and Product Data Management (PDM)
areas for years trying to bring information and people together, but have never
been able to do it like this technology does," says Murphy.

"It doesn't just bring information to the enterprise, it brings new ideas and
solutions."


What this means to you

  • Summarize vast amounts of material quickly

  • Innovation possibilities

  • Quicker time-to-market

  • Controllable access to worldwide patents and research



Step 1

An engineer performs a hypothetical search on a company's intranet for all the
documents pertaining to: Motion Control


Step 2

CoBrain searches the documents, reads them, indexes them, and compiles the
information into folders, such as:

  • Motion Control: linear motors

  • Motion Control: actuators

  • Motion Control: ball screws

  • Motion Control: hydraulics

  • Motion Control: pneumatics


Step 3

The engineer is interested in ball screws. He clicks this folder. Under the
heading: Ball screws, different types are listed such as: Telescoping, Hollow,
Greaseless, Miniature


Step 4

The engineer chooses Telescoping


Step 5

CoBrain shows a description of telescoping ball screws, what they are, a
diagram of what they look like, and a hyperlink to the document that originated
the information
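
The five steps above amount to a nested index: folders named by topic and method, subtypes inside each folder, and every entry linked back to its source. A minimal Python sketch of that structure follows. The record layout, the placeholder descriptions, and the document links are all invented for illustration and say nothing about how CoBrain stores its data.

```python
from collections import defaultdict

# Hypothetical records a semantic indexer might emit for the search above:
# (topic, method, subtype, description, link to source document).
RECORDS = [
    ("Motion Control", "ball screws", "Telescoping",
     "placeholder description", "intranet/doc-017"),
    ("Motion Control", "ball screws", "Miniature",
     "placeholder description", "intranet/doc-042"),
    ("Motion Control", "linear motors", "General",
     "placeholder description", "intranet/doc-003"),
]

def build_folders(records):
    """Group records into 'Topic: method' folders, then by subtype."""
    folders = defaultdict(lambda: defaultdict(list))
    for topic, method, subtype, description, source in records:
        folders[f"{topic}: {method}"][subtype].append(
            {"description": description, "source": source})
    # Return plain dicts so a missing key raises instead of silently appearing.
    return {name: dict(subtypes) for name, subtypes in folders.items()}

if __name__ == "__main__":
    folders = build_folders(RECORDS)
    for name in sorted(folders):                        # Step 2: list folders
        print(name)
    ball_screws = folders["Motion Control: ball screws"]  # Step 3: open one
    entry = ball_screws["Telescoping"][0]                 # Step 4: pick a type
    print(entry["description"], "->", entry["source"])    # Step 5: show entry
```

Keeping the source link on every entry is what lets the engineer jump from a summary back to the original document.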


Semantic Processing: reaching for the Holy Grail

Semantic processing is the Holy Grail of artificial intelligence, says Alan
Marwick, manager of text analysis and advanced research at IBM T.J. Watson
Research Center (Yorktown Heights, NY). For years, IBM researchers tried
unsuccessfully to use computer programming to capture the meaning of the written
or spoken language by analyzing an entire document or speech. By using a
narrower or shallower process, they are making headway. Programs search for
semantic "nuggets" or key words and extract these from texts, making logical
connections.

For example, a semantic processor analyzing an e-mail about your upcoming
office party may pull out four key words: Office party. Monday evening. While
the program has no concept of what these words mean, it knows that this is the
important information in the message.

A news story with the headline, "A Chill wind blows in Texas," provides no
information about what the article is about, Marwick continues. But a semantic
processor can summarize the story and tell you that the oil industry in that
state is in trouble. It can assign labels to the information and catalog it as
well.

"Text classification is better than human beings for filing documents," he
says. The computer is more consistent, never tires, and is more accurate.

Internally, IBM developed such a system to help company employees located all
over the world deal with the seven million documents on the IBM Intranet.
Marwick says they use lexical navigation to build an internal catalog system,
the WebCAT, to help people find, sort, and keep track of the information.

In 1998, the company released Intelligent Miner for Text, a tool kit for
programmers who want to write software capable of searching text.

The value in this technology is really a double whammy, says Marwick. "We are
all in a state of information overload. We are constantly interrupted by
incoming e-mails and electronic documents and need more tools to deal with
information faster." A semantic processing program could easily search and
summarize information, and help people weed through the "junk."

Semantic processing will be the basis of knowledge management programming in
the future, says Marwick. In fact, Lotus Notes Release 5 has a version of such a
process, which creates a network of connectors between people, organizations,
and companies.

Eventually even the present phone system of "Press 1 for company directory,"
or "Press 2 for sales," will disappear. Semantic processing will enable users to
communicate with a computer as they would with an operator, says Marwick.
