SSM for Knowledge Elicitation & Representation

Publishing history: This paper was published as Warwick Business School Research Paper No. 98 (ISSN 0265-5976) in August 1993. With extensive revisions and additions it was published as Soft Systems Models for Knowledge Elicitation and Representation in Journal of the Operational Research Society (1995) 46, 562-578.

Abstract
The paper contends that the conceptual models used in Soft Systems Methodology have an unusual logical status. This enables them to be rendered in modal logic and used as a framework for knowledge elicitation and for the design of knowledge-based systems with learning capability. A theoretical framework is presented which suggests that this can be accomplished in a six-stage process. Stage one is systems analysis to determine the nature of the system that is required by the stake-holders in the organization concerned. The modelling device is SSM conceptual models which are produced by an iterative debate amongst the stake-holders. Stage two consists of a Wittgensteinian language game in which an agreed logically precise artificial language is created to describe the problem situation. The modelling device is a Logico-linguistic model which is produced by the stake-holders. Stage three is knowledge elicitation whereby the stake-holders' knowledge of real world events is brought out and added to the Logico-linguistic model to form an empirical model. Areas where knowledge is lacking will also be identified at this stage. It will, therefore, also function as a method of information requirements analysis which will indicate where further empirical research is required. Stage four consists of knowledge representation in which the empirical model is expressed in a formal calculus. Stage five is codification in which the model is expressed in a computer language such as Prolog. Stage six is the process of verification which will continue when the program is used in practice.

Key words

 * conceptual models, Soft Systems Methodology, logico-linguistic models, non-monotonic logic, modal logic, knowledge based systems, Prolog

Introduction
SSM is primarily a methodology for systems analysis. It claims to be relevant to any problem situation involving human activity. The early stages of the method are more concerned with the identification of who is involved in the problem and what the problem is than with the solution of the problem. SSM is not an information system design method as such, but a general problem structuring method that may be used in the production of an information system design as one of many possible solutions.

SSM, therefore, has a versatility not found in mainstream information system design methodologies. The use of an information system design method should be based on some form of system analysis, which indicates that an information system will be a solution to the problem. In the case where the system analysis is a front end part of an information system design methodology, the work of systems analysis will tend to be wasted if an information system is not required. In the case where the method of system analysis is distinct from the information system design method, both methods will tend to use different tools and have different perspectives; as a result, little of the information gained in system analysis will be used in the design process.

This paper offers another possibility: one in which there is a continuity between the general systems analysis of SSM and the design process. The idea of preserving this continuity is not new. Methods have been developed by Wilson (1984, 1990) and by Avison and Wood-Harper (Multiview, 1990), but the method used here is radically different. While Wilson and Multiview seek to build upon the stake-holder constructed conceptual models in order to create information systems, the method described below seeks to increase the logical power and content of these models to the point where an information system design can be derived by formal methods.

The paper consists of six main sections. These are concerned with systems analysis, language creation, knowledge elicitation, knowledge representation, codification and verification. Each section begins with a model, then the shortcomings of the model are explained and a remedy suggested, leading to the model in the following section. Each model has a number of uses and need not be merely a stepping stone to the next model. SSM as a tree structure is implicit in this idea. The central trunk is the basic conceptual model and a number of uses branch off; the branches themselves sub-divide. This paper will initially follow the branch of information system design and a sub-division of this concerned with knowledge based systems will be followed to the conclusion.

Systems Analysis
Figure 1 gives an SSM conceptual model of a human activity system. It is taken from Wilson (1984, 1990) but has been simplified to exclude the monitor and control systems that accompany all SSM models.

There are a number of factors that make these SSM models quite different from models to be found in other forms of system analysis. The first is that the models are actually constructed, rather than merely approved, by the stake-holders in the organization concerned. The second factor is that the models are notional. Checkland and Scholes describe them as 'holons': intellectual constructs rather than systems in the world. The third factor is that the arrows in the models are intended to represent logical contingency (Checkland and Scholes, 1990, p 36). They are not, as many people assume, intended to represent causal connections.

The results of Checkland's case work can generally be described as a change of thinking on the part of the people in the organization. This is quite compatible with the notional status of the models. However, theoretical problems arise when we consider Wilson’s work, which attempts to use these models as a basis for the design of information systems that deal with real-world events. It needs to be explained how models that are only inventions of the mind can tell us anything about the physical world.

It has been argued elsewhere (Gregory, 1993c) that from the perspective of philosophical logic the stake-holders' debate is a Wittgensteinian language game in which the finalized conceptual model is a definition of a desirable state of affairs. If this is accepted, it opens the way to extend the stake-holder driven process to produce an artificial language that will be capable of being represented in computer code.

Language creation
The vocabulary of the new language will be provided entirely by the stake-holders. We shall take a modal predicate logic as the syntax of the language. This will require an increase in the number of logical connectives used in the model building process. It has been argued previously (Gregory, 1993a) that the model will need to have logical connectives capable of representing causal sequences. There are two reasons. One is that these connectives will be needed later when we come to build the empirical model. The other is that they are needed immediately to express process definitions. The result will be a Logico-linguistic model. Figure 2 shows a Logico-linguistic model that has been built out of figure 1.



Most things are defined by their qualities. Other things are defined by the process of their production. Whiskey is a spirit distilled from fermented malted grain. This means that if you take some grain, malt it, ferment it and then distill it you end up with whiskey, no matter what its taste or other qualities are like (Gregory, 1993c). Many SSM models are process definitions and this has led some people to think they are causal descriptions.

Causation is explained in terms of necessary and sufficient conditions. In Figure 1, if we say r is logically dependent on q we are saying the same thing as saying q is necessary for r, but this does not mean that q is sufficient for r. The logical way of expressing this is to say that r implies q. In symbols: $$ r \to q $$. If q were a sufficient condition of r the expression would be the reverse, i.e. $$ q \to r $$. A necessary and sufficient condition can be expressed by the logical connective known as “the biconditional”, represented by the double-headed arrow. In the diagrams a necessary condition will be indicated by a solid arrow, a sufficient condition by a broken arrow and a necessary and sufficient condition by a solid double-headed arrow. To these are added an “AND” containing bubble representing conjunction (p and q), an “ANDOR” containing bubble representing inclusive disjunction (p or q or both) and an “OR” containing bubble representing exclusive disjunction (p or q but not both). “AND” is represented by the logical symbol “&” and “ANDOR” by “v”. The “OR” connective can be expressed as $$ (p \And \neg q) \lor ( \neg p \And q) $$. Figure 2 can be expressed in the propositional calculus as follows:

$$((s \And a \And b) \leftrightarrow t) \And (s \leftrightarrow (u \lor y \lor w))$$

This can be rendered in English as “Patients are discharged if and only if they are alive, have been signed out and have been treated; and Patients will have been treated if and only if they have had surgery, medicine or therapy”.
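As an illustration only, the propositional reading of Figure 2 can be checked mechanically. The following Python sketch is not part of the method described in this paper; the function names `xor` and `figure2_holds` are hypothetical, and the letters follow the formula above (s treated, a alive, b signed out, t discharged, u surgery, y medicine, w therapy).

```python
def xor(p: bool, q: bool) -> bool:
    """Exclusive disjunction, the "OR" bubble: (p & not q) v (not p & q)."""
    return (p and not q) or (not p and q)


def figure2_holds(s: bool, a: bool, b: bool, t: bool,
                  u: bool, y: bool, w: bool) -> bool:
    """True when the assignment satisfies ((s & a & b) <-> t) & (s <-> (u v y v w))."""
    return ((s and a and b) == t) and (s == (u or y or w))


# A patient treated by surgery, alive and signed out counts as discharged.
print(figure2_holds(s=True, a=True, b=True, t=True, u=True, y=False, w=False))
# The same patient cannot consistently be recorded as not discharged.
print(figure2_holds(s=True, a=True, b=True, t=False, u=True, y=False, w=False))
```

The first call prints `True` and the second `False`, because the biconditional rules out a treated, living, signed-out patient who is not discharged.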

Two other logical moves are present in Figure 2. One is that the contents of the bubbles have been changed from imperatives (commands) to declaratives (statements or propositions). This conversion is necessary in order that standard logics can be used. It will also be useful when we come to express the model in Prolog, which is a declarative language. The second is the addition of ‘L’ symbols alongside the arrows. These are modal operators. As ((s and a and b) implies t) and (t implies (s and a and b)) standing alone could be either a process definition or a causal description we need a means of indicating which it is. The ‘L’ modal operator indicates that the relation is either a definition or a deduction from a definition. It is paired with the ‘M’ operator which indicates that the relation is factual, logically contingent and not definitional.

Knowledge elicitation
The logico-linguistic model provides a framework that will enable stake-holders to build an empirical model without ambiguity. In the empirical model, putative facts about the real world will be added to the logico-linguistic model.

There are two types of definition: intensive definition and extensive definition. An intensive definition will give a criterion or criteria for class inclusion. An extensive definition will specify the members of the class. Thus, in Figure 2 ‘patient is discharged’ is given an intensive definition. What it says is that anything that fulfills the criterion of being a treated patient and a living patient and a signed out patient is a member of the class of discharged patients, and vice versa. ‘Patient is treated’ is given an extensive definition. The figure states that this class has three member classes (u, y and w) and only three member classes. Therefore, anything that is a member of one or more member classes will be a member of the class of patients treated, and anything that is a member of the class of patients treated must be a member of at least one of the member classes.
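The difference between the two kinds of definition can be sketched in Python. This is a hedged illustration rather than part of the method: the `Patient` record and its field names are assumptions introduced here, as are the sample patients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    """Hypothetical record; the field names are illustrative assumptions."""
    name: str
    alive: bool = False
    signed_out: bool = False
    surgery: bool = False
    medicine: bool = False
    therapy: bool = False


def is_treated(p: Patient) -> bool:
    # Extensive definition: the class is the union of its three member classes.
    return p.surgery or p.medicine or p.therapy


def is_discharged(p: Patient) -> bool:
    # Intensive definition: a criterion (here a conjunction) for class inclusion.
    return is_treated(p) and p.alive and p.signed_out


jill = Patient("jill", alive=True, signed_out=True, surgery=True)
jack = Patient("jack", alive=True, signed_out=True)
print(is_discharged(jill), is_discharged(jack))  # True False
```

Jack fails the criterion only because he belongs to none of the three member classes of treated patients.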

It is contended that any term that is given a useful intensive definition will have an empirical extension, and any term that is given a useful extensive definition will have an empirical intension. Knowledge elicitation, therefore, will consist simply of giving the intension for an extensive definition and the extension for an intensive definition. The intension of ‘Patient is treated’ might be that every patient has been attended to by a doctor or nurse who has taken some action that is believed to improve the patient’s health. Empirical intensions are not always particularly useful. Far more important are the empirical extensions of intensive definitions. In Figure 3 we will take the extension of ‘Patient is discharged’ to be the class that comprises the class of patients who return home and the class of patients who are transferred to other institutions. These classes are mutually exclusive in that a member of one cannot be a member of another; as such, they are included in an "OR" bubble. This extension is putatively true as a matter of empirical fact, not as a matter of definition. It is, therefore, marked with the 'M' modal operator. These empirical counterparts of definitions are known as 'inductive hypotheses'.

The fact that bubbles c and d are linked to bubble t by a double-headed arrow indicates that the stake-holders think that the formula $$ c \lor d $$ forms the full extension of t. In this case we have full knowledge of t. In practice the knowledge of the stake-holders is often insufficient to give the full empirical extension for every intensive definition in the system. In this case there are three possible courses of action. One is to conduct empirical research in order to find the full extension. A second is to build a system with incomplete knowledge; if this is done the system will be logically incomplete and there will be statements that are undecidable, that is, the system will not be able to determine whether they are true or false. A third possibility is for the stake-holders to make an educated guess and hope that the system will detect any errors. A system of non-monotonic logic is introduced below which makes this third possibility viable.

Knowledge representation
Figure 3 can be formally expressed in modal predicate logic as follows:

Domain: people who go to hospital

Sx: x is a patient who is treated

Ax: x is a patient who is alive

Bx: x is a patient who is signed out

Tx: x is a patient who is discharged

Ux: x is a patient who has surgery

Yx: x is a patient who has medicine

Wx: x is a patient who has therapy

Cx: x is a patient who returns home

Dx: x is a patient who is transferred to another institution

Prem. (1) $$L (\forall x) (Tx \leftrightarrow (Sx \And Ax \And Bx))$$

Prem. (2) $$L (\forall x) (Sx \leftrightarrow (Ux \lor Yx \lor Wx))$$

Prem. (3) $$M (\forall x) (Tx \leftrightarrow ((Cx \And \neg Dx) \lor (\neg Cx \And Dx)))$$

Inference: (4) $$M (\forall x) (((Cx \And \neg Dx) \lor (\neg Cx \And Dx)) \rightarrow (Ux \lor Yx \lor Wx))$$

Premise (1) can be expressed in English as 'For all x, x is a patient who is discharged if and only if x is a patient who is treated and alive and signed out'. Formulae which begin with $$ (\forall x)$$ are known as universals, as are the English statements that correspond to them. The formula (4) can be deduced from the three premises; it is shown in Figure 3 by the dotted arrow and the solid single-headed arrow. This completes the system of universals, but so far it is only about object variables, in this case 'x'. It says nothing about the real world, not even that anything exists. The real world connection is made when particulars and existential statements are added to the system. We shall not introduce particulars into this system of modal predicate logic: instead, we shall move on to Prolog where the universals shall become 'rules' and particulars will become Prolog 'facts'.
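That formula (4) follows from the three premises can be checked by brute force rather than by formal derivation. The Python sketch below is an illustration under stated assumptions: it ignores the modal operators, and it treats the nine predicates as truth values for one arbitrary individual x, which suffices because every formula is a universal over the same variable. The function names are hypothetical.

```python
from itertools import product


def iff(p: bool, q: bool) -> bool:
    return p == q


def xor(p: bool, q: bool) -> bool:
    return p != q


def premises_hold(S, A, B, T, U, Y, W, C, D) -> bool:
    """Premises (1), (2) and (3) for one individual, modal operators omitted."""
    return (iff(T, S and A and B)
            and iff(S, U or Y or W)
            and iff(T, xor(C, D)))


def formula_4(S, A, B, T, U, Y, W, C, D) -> bool:
    """(C xor D) -> (U v Y v W), written as a material conditional."""
    return (not xor(C, D)) or (U or Y or W)


# Search all 2**9 assignments for one that satisfies the premises but not (4).
counterexamples = [v for v in product([False, True], repeat=9)
                   if premises_hold(*v) and not formula_4(*v)]
print(counterexamples)  # []
```

The empty list means no assignment makes the premises true and (4) false, which is what the deduction claims.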

The Prolog model
The Horn clauses that form the logical format of all Prolog rules could be derived from the predicate logic given above. As a practical method this would not be very useful. For example, the formal derivation of (4) from the premises given above would require 17 lines. It is easier to look at Figure 3 when writing the Prolog. The logic is useful in so far as it shows that a formal specification of the Prolog program is possible. It also highlights some logical problems that Prolog has.

The program given here is written in Turbo Prolog. There are two serious difficulties in converting the logic, or a logical model like Figure 3, into Prolog: one is with negation, the other is with the biconditional. Prolog is unable to express Horn clauses with a negative consequent, and some formulae in predicate logic cannot be expressed in Horn clauses with positive consequents. The biconditionals express mutual implication, as is indicated by the double-headed arrows in Figure 3. As it works on the chaining principle, Prolog is unable to run a program that contains mutual implication, or any substitute for it, without going into an infinite loop. For example, if we want to say that people are mad if and only if they are insane, then we would normally write it as:

mad (X) if insane (X).

insane (X) if mad (X).

But Prolog cannot handle this. It looks for a value of mad (X) and sees it will have the same value as insane (X); it then goes on to the second line and sees that insane (X) has the same value as mad (X); then it returns to the first line and repeats the process infinitely.
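The looping behaviour can be imitated with a naive depth-first chainer. The Python sketch below is only an analogy, not a model of Turbo Prolog's actual implementation: the names `RULES`, `FACTS` and `prove` are hypothetical, and the `limit` parameter is an assumption that stands in for the exhaustion of memory, so that the example terminates.

```python
# Mutual implication: each predicate is defined only in terms of the other.
RULES = {"mad": ["insane"], "insane": ["mad"]}
FACTS = set()  # no particular facts have been entered yet


def prove(pred: str, who: str, depth: int = 0, limit: int = 100) -> bool:
    """Depth-first chaining with no loop detection, cut off after `limit` steps."""
    if (pred, who) in FACTS:
        return True
    if depth >= limit:
        raise RecursionError(f"no answer for {pred}({who}) after {limit} steps")
    return any(prove(body, who, depth + 1, limit)
               for body in RULES.get(pred, []))


try:
    prove("mad", "jill")
except RecursionError as err:
    print("looping:", err)
```

Without the cut-off the search would bounce between the two rules indefinitely, which is the behaviour described above.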

The solution is to give up negation altogether in the program. This can be achieved in the same way in which subtraction is eliminated from commercial accounts by a system of double entry book-keeping. We shall use artificial predicates prefixed by “not_” to express negation. Corresponding to these will be artificial objects also prefixed by “not_”. A positive predicate will always be paired with a negative predicate and a positive object paired with a negative one. Thus if we wish to say Jill is mad we will also say that not-Jill is not mad:

mad (jill).

not_mad (not_jill).

The two negatives can be understood as canceling each other out. We can also use this method to specify events that have not happened. For example, if Jack is not insane we can say:

insane (not_jack).

not_insane (jack).

A program for madness and insanity would be:

mad (X) if insane (X).

mad (jill).

not_mad (not_jill).

insane (not_jack).

not_insane (X) if not_mad (X).

not_insane (jack).

The query to determine who is mad i.e. “mad (X)” returns “not_jack” and “jill”; the query “not_insane (X)” returns “not_jill” and “jack”; not_mad (X) returns “not_jill”; insane (X) returns “not_jack”. Thus, we can soon find out who is mad and who isn't.
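The query results can be reproduced with a small simulation. The Python sketch below is an illustration only: it encodes the six clauses of the madness program as tuples and tries them from top to bottom, as Prolog would. The names `CLAUSES` and `solve` are hypothetical, and the encoding handles only rules with a single body goal.

```python
# Each clause is (head, body, constant): a rule has a body predicate,
# a fact has a constant. The order follows the program in the text.
CLAUSES = [
    ("mad", "insane", None),          # mad (X) if insane (X).
    ("mad", None, "jill"),            # mad (jill).
    ("not_mad", None, "not_jill"),    # not_mad (not_jill).
    ("insane", None, "not_jack"),     # insane (not_jack).
    ("not_insane", "not_mad", None),  # not_insane (X) if not_mad (X).
    ("not_insane", None, "jack"),     # not_insane (jack).
]


def solve(pred: str):
    """Yield every X satisfying pred(X), trying clauses from top to bottom."""
    for head, body, constant in CLAUSES:
        if head != pred:
            continue
        if body is None:
            yield constant           # a fact answers directly
        else:
            yield from solve(body)   # a rule defers to its body goal


for query in ("mad", "not_insane", "not_mad", "insane"):
    print(query, list(solve(query)))
```

No loop arises here because the two rules no longer refer back to one another: each positive or negative predicate is derived in one direction only.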

This procedure would enable us to express the three biconditionals from Figure 3 in Prolog. However, it does not produce concise programs. Indeed, the page of Prolog below represents only one line of predicate logic, namely formula (4). The incorrect_hypothesis clauses at the end are concerned with verification.

A Prolog program
Clauses

surgery (not_jack).

surgery (X) if not_medicine (X) and not_therapy (X) and returns_home (X).

surgery (X) if not_medicine (X) and not_therapy (X) and another_institution (X).

medicine (not_jill).

medicine (X) if not_surgery (X) and not_therapy (X) and returns_home (X).

medicine (X) if not_surgery (X) and not_therapy (X) and another_institution (X).

therapy (not_jill).

therapy (not_jack).

therapy (X) if not_surgery (X) and not_medicine (X) and returns_home (X).

therapy (X) if not_surgery (X) and not_medicine (X) and another_institution (X).

returns_home (jill).

returns_home (not_jack).

another_institution (not_jill).

another_institution (jack).

not_surgery (jack).

not_medicine (jill).

not_medicine (jack).

not_therapy (jill).

not_therapy (jack).

not_returns_home (jack).

not_returns_home (not_jill).

not_another_institution (not_jack).

incorrect_hypothesis (surgery_if_not_medicine_not_therapy_and_returns_home)
    if not_surgery (X) and not_medicine (X) and not_therapy (X) and returns_home (X).

incorrect_hypothesis (surgery_if_not_medicine_not_therapy_and_another_institution)
    if not_surgery (X) and not_medicine (X) and not_therapy (X) and another_institution (X).

incorrect_hypothesis (medicine_if_not_surgery_not_therapy_and_returns_home)
    if not_medicine (X) and not_surgery (X) and not_therapy (X) and returns_home (X).

incorrect_hypothesis (medicine_if_not_surgery_not_therapy_and_another_institution)
    if not_medicine (X) and not_surgery (X) and not_therapy (X) and another_institution (X).

incorrect_hypothesis (therapy_if_not_surgery_not_medicine_and_returns_home)
    if not_therapy (X) and not_surgery (X) and not_medicine (X) and returns_home (X).

incorrect_hypothesis (therapy_if_not_surgery_not_medicine_and_another_institution)
    if not_therapy (X) and not_surgery (X) and not_medicine (X) and another_institution (X).

Verification
Validation of the program is not a theoretical problem in this system because the rules can be formally derived from the formulae in predicate logic. Any error will be the result of either mistakes made during the construction of the empirical model or mistakes made in entering particular facts into the program. Errors in both respects can be picked up by the double entry system. For example our program produces:

Goal: surgery (X)

X = not_jack

X = jill

X = jack

This says that Jack has and has not had surgery: the goal returns “jack”, yet the program also contains the fact not_surgery (jack). This could have been a result of a mistake at a data entry level but in this case it is not. The mistake is in the empirical model. The incorrect_hypothesis clauses at the end of the program are designed to detect these errors. The incorrect_hypothesis predicate picks up inductive hypotheses that have been falsified by particular facts:

Goal: incorrect_hypothesis (X)

X = surgery_if_not_medicine_not_therapy_and_another_institution

X = medicine_if_not_surgery_not_therapy_and_another_institution

X = therapy_if_not_surgery_not_medicine_and_another_institution

Jack has not had surgery, medicine or therapy; he has not returned home but he has been transferred to another institution. The formula:

(4) $$M (\forall x) (((Cx \And \neg Dx) \lor (\neg Cx \And Dx)) \rightarrow (Ux \lor Yx \lor Wx))$$

which represents the broken arrow and the single-headed arrow in Figure 3, is, therefore, incorrect. It follows from this that one of the three premises (1), (2) or (3) must be incorrect. As premises (1) and (2) are logically true it must be premise (3), the one with the M modal operator, that is false. In simple terms, the hypothesis that all patients who return home or are transferred to another institution are discharged patients has been falsified by a particular event. This event is Jack being transferred to another institution without having had surgery, medicine or therapy. The Prolog program has been configured in such a way that the entry of data about Jack has enabled us to detect this. This is a form of non-monotonic logic; the program has learned that one of its premises is false.
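The detection step can be re-enacted outside Prolog. The Python sketch below is an illustration, not a translation of the Turbo Prolog engine: the facts are copied from the program in the text, while the names `FACTS`, `holds`, `solutions` and `incorrect_hypotheses` are hypothetical. It relies on the assumption that every body goal in the rules is answered by facts alone, which is true of this program.

```python
# Particular facts, copied from the Prolog clauses in the text.
FACTS = {
    "surgery": ["not_jack"],
    "medicine": ["not_jill"],
    "therapy": ["not_jill", "not_jack"],
    "returns_home": ["jill", "not_jack"],
    "another_institution": ["not_jill", "jack"],
    "not_surgery": ["jack"],
    "not_medicine": ["jill", "jack"],
    "not_therapy": ["jill", "jack"],
    "not_returns_home": ["jack", "not_jill"],
    "not_another_institution": ["not_jack"],
}
TREATMENTS = ("surgery", "medicine", "therapy")
OUTCOMES = ("returns_home", "another_institution")
OBJECTS = sorted({obj for objs in FACTS.values() for obj in objs})


def holds(pred: str, who: str) -> bool:
    return who in FACTS[pred]


def solutions(treatment: str) -> list:
    """Answers to treatment(X): the facts first, then one rule per outcome."""
    first, second = [f"not_{t}" for t in TREATMENTS if t != treatment]
    answers = list(FACTS[treatment])
    for outcome in OUTCOMES:
        answers += [who for who in FACTS[first]
                    if holds(second, who) and holds(outcome, who)]
    return answers


def incorrect_hypotheses() -> list:
    """Names of rules falsified by someone with an outcome but no treatment."""
    found = []
    for treatment in TREATMENTS:
        first, second = [t for t in TREATMENTS if t != treatment]
        for outcome in OUTCOMES:
            falsified = any(
                all(holds(f"not_{t}", who) for t in TREATMENTS)
                and holds(outcome, who)
                for who in OBJECTS
            )
            if falsified:
                found.append(
                    f"{treatment}_if_not_{first}_not_{second}_and_{outcome}")
    return found


print(solutions("surgery"))      # ['not_jack', 'jill', 'jack']
print(incorrect_hypotheses())
```

The second call lists the three hypotheses ending in `_and_another_institution`, since Jack is the only object recorded as having had none of the three treatments.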

The benefits of the earlier SSM work can now be seen. The modal distinctions were made using SSM and, without the modal distinctions, we would not be able to determine which of the three biconditionals in Figure 3 is false. Without the modal distinctions, all three biconditionals would have the same status. If they all had the status of inductive hypotheses then the fact that Jack has been transferred to another institution without having surgery, medicine or therapy could be equally well explained by 'all discharged patients are treated, alive and signed out' being false or by 'all treated patients have surgery, medicine or therapy' being false. If they all had the status of logical truth the situation would be even more unsatisfactory.
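The role of the modal tags in locating the error can be made concrete. The Python sketch below is a deliberately simple illustration: the dictionary `PREMISES` and the function `revisable` are hypothetical names, and the English glosses are shorthand for premises (1) to (3).

```python
# Each premise carries its modal tag: "L" for definitional, "M" for contingent.
PREMISES = {
    1: ("L", "discharged <-> treated & alive & signed out"),
    2: ("L", "treated <-> surgery v medicine v therapy"),
    3: ("M", "discharged <-> returns home xor transferred"),
}


def revisable(premises: dict, used: list) -> list:
    """Premises behind a falsified inference that are open to revision."""
    return [n for n in used if premises[n][0] == "M"]


# Formula (4) was deduced from all three premises and has been falsified.
print(revisable(PREMISES, [1, 2, 3]))  # [3]

# Without modal distinctions every premise has the same status.
all_hypotheses = {n: ("M", text) for n, (_, text) in PREMISES.items()}
all_definitions = {n: ("L", text) for n, (_, text) in PREMISES.items()}
print(revisable(all_hypotheses, [1, 2, 3]))   # [1, 2, 3]
print(revisable(all_definitions, [1, 2, 3]))  # []
```

With the tags in place the blame falls on premise (3) alone; with uniform tags either every premise is suspect or none can be revised.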

Consider what would happen if the three biconditionals had the status of logical truth. If this were the case the system would only accept those empirical particulars that are consistent with its in-built logical configuration. All other particulars would be rejected. From this it follows that if $$(\forall x) (Tx \leftrightarrow ((Cx \And \neg Dx) \lor (\neg Cx \And Dx)))$$ were a logical truth, then before we could establish that Jack has been transferred to another institution we would have to establish that he had not returned home and that he has been discharged. To establish that he has been discharged we would have to establish that he had treatment, and to do this we would have to establish that he has had surgery, medicine or therapy. In other words, to establish that Jack has been transferred we must first establish that Jack has had surgery, medicine or therapy. We need to do this because having surgery, medicine or therapy is part of the extended definition of a patient who has been transferred. However, in this case the model does not enable us to infer anything new about Jack at all. So, what use is the model?

Verification, unlike validation, is only possible if part of the system is open to falsification. The hypotheses in the system will be verified with the addition of each particular fact that does not falsify them. Systems that cannot be verified, even if they can be validated, cannot in themselves refer to real-world objects and events. Real-world events are contingent and, therefore, any statement about the real world must also be contingent. Systems that do not contain contingent elements will only map onto these real-world events if they want to and, as such, do not really map onto the real world at all (Gregory, 1993b). Inductive hypotheses form an indispensable buffer between definitions and real-world particular facts.

Conclusions
Checkland & Scholes (1990) claim that an information system “...will always have to include the attribution of meaning, which is a uniquely human act. An information system, in the exact sense of the phrase, will consist of both data manipulation, which machines can do, and the transformation of data into information by the attribution of meaning.”

Traditional methods of information system design, such as SSADM and Information Engineering, produce systems in which tacitly all the universals are incorrigible. This prevents them from making statements that refer to the real world. It is up to the human operators to determine reference. If the operators do this, they will do so by formulating inductive hypotheses as well as by observing particular facts.

This paper has provided a description of the logic that would be required for the creation of a computer system that will be able to make statements in a language belonging to clients and users. In such a computer system the attribution of meaning would, to a large extent, cease to be a uniquely human act and the computer system would become a true information system in Checkland & Scholes' sense of the word.