The development of a pharmaceutical product requires the collection of information during preformulation studies, product formulation, process development, scale-up and manufacturing. A large amount of information of different types, ranging from raw data to lab reports to sophisticated mathematical models, has to be shared, used and modified for decision making. This requires an informatics infrastructure that supports these activities by streamlining information gathering, data integration, model development and decision making (Venkatasubramanian et al., 2006). The foundation of such an infrastructure is explicitly and formally modeled information, called an ontology.
The Purdue Ontology for Pharmaceutical Engineering (POPE) is presented here. Information is captured in the form of concepts and the relations between concepts. The ontology is centered on the concepts of materials, experiments and properties and builds on previous work. Defining the ontology makes available several functions that are otherwise difficult to perform, such as complicated searches, association storage and reasoning.
The Purdue Ontology for Pharmaceutical Engineering includes several components, as shown in Figure 1. Expert knowledge is modeled in the form of guidelines in the ontological infrastructure. A guideline models procedural knowledge, which consists of decision logic, information look-up, evaluation of decision variables and provision of recommendations. These components are captured in the POPE ontology.
The POPE ontology also describes mathematical knowledge, which consists of the mathematical equations as well as the underlying assumptions about the physical, chemical or biological phenomena. This separates the declarative and procedural components of creating, manipulating and solving mathematical models. The declarative part consists of two main ontologies: one that represents the details of a model (Model Definition), such as the model equations and state variables, and one that represents the details of its use in modeling a specific processing step (Model Use).
The model equations are described in the Mathematical Markup Language (MathML). MathML has two dialects: presentation markup is used for rendering an equation, for example in a web browser or in a printed document, while content markup defines the semantics (meaning) of the equation. A suitable MathML editor such as WebEQ can be used for writing mathematical equations. The procedural part deals with solving the model equations.
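The difference between the two MathML dialects can be illustrated with a minimal Python sketch. The equation y = a x, the helper names local_name and evaluate, and the numeric values are assumptions made for illustration and are not part of POPE. The evaluator handles only a small subset of content markup (ci, cn, plus, times).

```python
import math
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Presentation markup: only says how "y = a x" should be laid out.
PRESENTATION = f"""<math xmlns="{MATHML_NS}">
  <mrow><mi>y</mi><mo>=</mo><mi>a</mi><mo>&#x2062;</mo><mi>x</mi></mrow>
</math>"""

# Content markup: says that y equals the product of a and x.
CONTENT = f"""<math xmlns="{MATHML_NS}">
  <apply><eq/>
    <ci>y</ci>
    <apply><times/><ci>a</ci><ci>x</ci></apply>
  </apply>
</math>"""


def local_name(element):
    """Return the element tag without its XML namespace prefix."""
    return element.tag.split("}", 1)[-1]


def evaluate(node, values):
    """Evaluate a small subset of content MathML (ci, cn, plus, times)."""
    name = local_name(node)
    if name == "ci":
        return values[node.text.strip()]
    if name == "cn":
        return float(node.text)
    if name == "apply":
        operator, *operands = list(node)
        results = [evaluate(operand, values) for operand in operands]
        if local_name(operator) == "plus":
            return sum(results)
        if local_name(operator) == "times":
            return math.prod(results)
        raise ValueError(f"unsupported operator: {local_name(operator)}")
    raise ValueError(f"unsupported element: {name}")


equation = ET.fromstring(CONTENT)[0]        # the outer <apply> holding <eq/>
_, left_side, right_side = list(equation)   # <eq/>, <ci>y</ci>, <apply>...</apply>
result = evaluate(right_side, {"a": 2.0, "x": 3.0})  # illustrative values
print(left_side.text, "=", result)
```

Because the content markup records the operator and its operands explicitly, a program can evaluate or manipulate the equation. The presentation markup carries only layout elements such as mrow, mi and mo.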
Figure 1: Overview of the POPE ontology
The information ontologies consist of ten categories:
These ontologies are inter-related as shown in Figure 2. The first five ontologies are beginning points in the ontology navigation, but all ontologies are visited during the navigation.
Figure 2: Information components of POPE
In POME, a material is described in terms of its substance entity (environment independent), its phase system entity (environment dependent) and its role in the mixture (e.g. for solids: flow aid, diluent, etc.). The phase system is described by the fraction and identity of the phases comprising it (phase composition). Each phase has a chemical composition, which describes the species and their relative abundance in the given phase, as well as the environmental conditions (e.g. temperature, pressure). The species are described in PODS. Substance properties such as molecular mass and critical temperature are included in POMP.
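The material description above can be pictured as a small data structure. The following Python sketch is only an illustration: the class and field names, the consistency check on phase fractions and the example values are assumptions, not the actual POME schema.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    """One phase: its chemical composition and environmental conditions."""
    name: str
    composition: dict[str, float]  # species -> relative abundance (assumed fraction)
    temperature_K: float
    pressure_Pa: float


@dataclass
class PhaseSystem:
    """Environment-dependent part: which phases are present and in what fraction."""
    phase_fractions: list[tuple[Phase, float]]

    def is_consistent(self, tol=1e-9):
        # Assumed check: the phase fractions should add up to one.
        total = sum(fraction for _, fraction in self.phase_fractions)
        return abs(total - 1.0) <= tol


@dataclass
class Material:
    substance: str             # environment-independent identity
    phase_system: PhaseSystem  # environment-dependent description
    role: str                  # role in the mixture, e.g. "diluent"


# Illustrative values only, not measured data.
solid = Phase(
    name="crystalline solid",
    composition={"lactose monohydrate": 1.0},
    temperature_K=298.15,
    pressure_Pa=101325.0,
)
material = Material(
    substance="lactose monohydrate",
    phase_system=PhaseSystem(phase_fractions=[(solid, 1.0)]),
    role="diluent",
)
print(material.role, material.phase_system.is_consistent())
```

Keeping the substance separate from the phase system mirrors the split between environment-independent and environment-dependent information described in the text.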
PODS makes use of common molecular fragments drawn from a search of the most common drug degradation reactions and their participants. Each fragment is part of a "fragment-entity", which might participate in a reaction and is connected to (or identified as) a backbone group. This ontology can be coupled with PORE to represent chemical systems and with POME to describe a material during product development.
The concept of a reaction may include physical and chemical changes. PORE was developed to represent reactions as interactions between fragments or phase systems. Each reaction has a reaction_context, which holds the pertinent descriptors of the reaction, e.g. the temperature, pressure and pH at which it occurs. Several restrictions were put in place, such as the requirement that a reaction have at least one reactant and one product.
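The restriction on reactants and products can be sketched as a validation rule. This Python example is an assumption-laden illustration: the class names Reaction and ReactionContext, their fields and the example values are hypothetical and do not reproduce the PORE definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReactionContext:
    """Descriptors of the conditions under which a reaction occurs."""
    temperature_K: Optional[float] = None
    pressure_Pa: Optional[float] = None
    pH: Optional[float] = None


@dataclass(frozen=True)
class Reaction:
    name: str
    reactants: tuple[str, ...]
    products: tuple[str, ...]
    context: ReactionContext

    def __post_init__(self):
        # Restriction from the text: at least one reactant and one product.
        if not self.reactants:
            raise ValueError("a reaction needs at least one reactant")
        if not self.products:
            raise ValueError("a reaction needs at least one product")


# Illustrative context values only.
hydrolysis = Reaction(
    name="ester hydrolysis",
    reactants=("ester", "water"),
    products=("carboxylic acid", "alcohol"),
    context=ReactionContext(temperature_K=313.15, pH=2.0),
)
print(hydrolysis.name, len(hydrolysis.reactants), len(hydrolysis.products))
```

Placing the check in the constructor means an invalid reaction cannot be created at all, which is one way to mimic a cardinality restriction in an ontology.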
POMP describes properties at different levels. A material may have substance (molecular level) or phase system properties. Solid properties are described at two levels: particle properties (pertaining to single crystals or amorphous particles) and powder (bulk) properties such as particle size distribution. Each property value is associated with a set of environmental conditions during measurement (e.g. temperature, pressure) and a source (experiment, mathematical model or literature). Interactions (chemical and physical reactions) are included in material properties.
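A property value tied to its measurement conditions and source could be recorded as in the following Python sketch. The names Level, Source, PropertyValue and select, as well as the numbers, are assumptions for illustration and are not taken from POMP.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    SUBSTANCE = "substance"
    PHASE_SYSTEM = "phase system"
    PARTICLE = "particle"
    POWDER = "powder"


class Source(Enum):
    EXPERIMENT = "experiment"
    MATHEMATICAL_MODEL = "mathematical model"
    LITERATURE = "literature"


@dataclass(frozen=True)
class PropertyValue:
    property_name: str
    level: Level
    value: float
    unit: str
    conditions: dict  # environmental conditions during measurement
    source: Source


def select(records, property_name, **conditions):
    """Return the records for a property that match all given conditions."""
    return [
        record
        for record in records
        if record.property_name == property_name
        and all(record.conditions.get(key) == value for key, value in conditions.items())
    ]


# Illustrative numbers, not measured data.
records = [
    PropertyValue("bulk density", Level.POWDER, 0.45, "g/cm3",
                  {"temperature_K": 298.15}, Source.EXPERIMENT),
    PropertyValue("bulk density", Level.POWDER, 0.47, "g/cm3",
                  {"temperature_K": 313.15}, Source.LITERATURE),
]
matches = select(records, "bulk density", temperature_K=298.15)
print([(m.value, m.unit, m.source.value) for m in matches])
```

Storing the conditions and source with each value keeps values of the same property from different origins distinguishable, which is the point of the correlation described above.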
PODE was developed to link experiments to material properties and reactions in a semantically rich manner. Experiments have some generic characteristics that are independent of the property to be measured: these include the time and place of the experiment as well as the identity of the people who performed it. Equipment and experimental procedures are modeled as a collection of actions, which can be observation/measurement actions, processing actions (e.g. mix, separate) or operation actions. The connection between pieces of equipment is captured through equipment adjacency. Each piece of equipment has a setting, and the experiment is performed under a given context.
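The experiment structure described above can be sketched in Python as follows. Every name here (ActionKind, Action, Experiment, connect, neighbors), the choice of an undirected adjacency map and all example values are assumptions for illustration, not the PODE definitions.

```python
from dataclasses import dataclass, field
from enum import Enum


class ActionKind(Enum):
    OBSERVATION_MEASUREMENT = "observation/measurement"
    PROCESSING = "processing"
    OPERATION = "operation"


@dataclass
class Action:
    kind: ActionKind
    description: str
    equipment: str


@dataclass
class Experiment:
    # Generic characteristics, independent of the property being measured.
    performed_by: list[str]
    place: str
    time: str
    actions: list[Action] = field(default_factory=list)
    settings: dict[str, dict[str, float]] = field(default_factory=dict)
    adjacency: dict[str, set[str]] = field(default_factory=dict)

    def connect(self, first, second):
        """Record that two pieces of equipment are adjacent (assumed undirected)."""
        self.adjacency.setdefault(first, set()).add(second)
        self.adjacency.setdefault(second, set()).add(first)

    def neighbors(self, equipment):
        """Return the equipment adjacent to the given piece of equipment."""
        return set(self.adjacency.get(equipment, set()))


# Placeholder people, place, time and settings; none come from the source.
experiment = Experiment(
    performed_by=["analyst A"],
    place="laboratory 1",
    time="2000-01-01T09:00",
)
experiment.actions.append(Action(ActionKind.PROCESSING, "mix", "blender"))
experiment.actions.append(
    Action(ActionKind.OBSERVATION_MEASUREMENT, "measure particle size distribution", "mill")
)
experiment.settings["blender"] = {"speed_rpm": 20.0}
experiment.connect("mill", "blender")
print(experiment.neighbors("mill"))
```

An adjacency map is only one possible encoding. A directed map would be needed if the ontology distinguishes upstream from downstream equipment, which the text does not specify.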
Venkat Venkatasubramanian* (Prof.)
Gintaras Reklaitis (Prof.)
Shuo-Huan Hsu (PhD)
Chunhua Zhao (PhD)
Girish Joglekar (PhD)
Ankur Jain (PhD)
School of Chemical Engineering
Forney Hall of Chemical
480 Stadium Mall Drive
West Lafayette, IN 47907-2100
V. Venkatasubramanian, C. Zhao, G. Joglekar, A. Jain, L. Hailemariam, P. Suresh, V. Akkisetty, K. Morris, G.V. Reklaitis (2006) Ontological Informatics Infrastructure for chemical product design and process development, Computers and Chemical Engineering, CPC 7 Special Issue, 30(10-12), 1482-1496.