MAIChem™ - Document Indexing with Chemical Names
Finding chemical names in documents is a challenge because the number of potential chemical compounds is effectively unlimited and a given compound can be named in many ways. Simply matching a fixed list of names against the text is therefore impractical.
MAIChem solves this problem by processing the text against regular expressions that match typical chemical morphemes, such as "hydro" or "amine", to see whether they occur in words. Following this initial analysis, additional algorithms are applied to differentiate chemical names from non-chemical words that use the same morphemes.
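The two-stage idea can be illustrated with a minimal Python sketch. It is not MAIChem's implementation: the morpheme list, the coverage heuristic, the 0.6 threshold, and the function names `find_candidates` and `likely_chemical` are all assumptions chosen for illustration. It also treats each word separately, so multi-word names such as "hydrogen sulfate" are not joined.

```python
import re

# Hypothetical morpheme list, for illustration only (not MAIChem's patterns).
MORPHEMES = ["hydro", "amine", "sulf", "chlor", "meth", "eth", "oxide", "ate", "yl"]

# Longest alternatives first, so longer morphemes win at a given position.
MORPHEME_RE = re.compile(
    "|".join(sorted(map(re.escape, MORPHEMES), key=len, reverse=True)),
    re.IGNORECASE,
)

# Simplification: a "word" is a run of ASCII letters (no digits or hyphens).
WORD_RE = re.compile(r"[A-Za-z]+")


def find_candidates(text):
    """Stage 1: return (word, morphemes) for words containing any morpheme."""
    candidates = []
    for match in WORD_RE.finditer(text):
        word = match.group()
        hits = [m.group().lower() for m in MORPHEME_RE.finditer(word)]
        if hits:
            candidates.append((word, hits))
    return candidates


def likely_chemical(text, threshold=0.6):
    """Stage 2 (assumed heuristic): keep words mostly built from morphemes."""
    kept = []
    for word, hits in find_candidates(text):
        coverage = sum(len(h) for h in hits) / len(word)
        if coverage >= threshold:
            kept.append(word)
    return kept


sample = "Hydrophobia is unrelated to methylamine and hydrogen sulfate."
print(likely_chemical(sample))
# prints ['methylamine', 'hydrogen', 'sulfate']
```

In this sketch, "Hydrophobia" and "unrelated" both pass the first stage (they contain "hydro" and "ate") but are dropped in the second because morphemes cover less than 60% of their letters. A production system would need far richer rules than this single ratio.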
For example, the morpheme "hydro" appears in non-chemical words such as "hydrophobia" as well as in legitimate chemical names such as "hydrogen sulfate." With all potential chemical morphemes in a document identified, MAIChem uses the morphemes as building blocks to distinguish chemical names from non-chemical text strings. The system also generates a list of synonyms or variations on the names.
Your knowledge workers get technical documents indexed with chemical names for:
- Review and analysis
- Automatic or machine-aided indexing
- Content discovery
- Faster information retrieval
- Connecting to structure diagramming software
- Enhanced metadata management
MAIChem can be set up to run as a stand-alone program or over a network with clients using a web browser.