Abstract: Rule Based Programming, a declarative programming paradigm, is based on logical patterns to select data and associate it with processing instructions. This is a more indirect method than the sequential execution steps of an imperative programming language.
Welcome to the 205th issue of The Java(tm) Specialists' Newsletter. I would like to thank Dr Wolfgang Laun for educating us about rule-based programming in Java. Wolfgang has been using Java for many years and has found a nice way to marry our language with rules. Before we hand over to our friend, Dr Laun, a quick anecdote. I was presenting my masters course in Austria for his team last March. At the end of the training, the administrator asked him a question and addressed him as "Herr Laun". At the same time, Wolfgang asked me for my name tag, by saying "dogtag bitte". But I understood him as "Doktor bitte" (doctor please), so I thought he was telling the administrator to please address him with his title, not just the plain "Herr". This was quite out of character for him, as Wolfgang is a very humble person, so I just stood there, staring at him. He repeated his request of "dogtag bitte", whilst looking at me, obviously wondering why I was not responding. After some more requests, I eventually figured out that he was talking to me.
Over to Dogtag Laun, I hope you will enjoy his writing. At the end he extends an invitation to participate in a rules conference if you are interested in the subject.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
There are several projects extending Java towards programming paradigms other than procedural and object-oriented. Scala, for instance, adds functional programming while retaining object-orientedness and full integration with Java. Others head for what is called rule-based programming, one of the several flavours of logic programming. As Heinz was kind enough to invite me to present a digression into this technique I'll guide you through a short tour of the hallmarks of one of these systems, highlighting principles rather than technical details.
Rule-based systems (RBS) have evolved from developments in the area of artificial intelligence, as the foundation for expert systems, but they have outgrown their heritage long since, and gained a very strong foothold in many lines of business and technical applications.
The sample rules presented below are written according to the RBS Drools, an open source project started more than ten years ago, and still going strong. Its architecture adheres to the typical composition: an Inference Engine (IE) is driven by the rules in the rule base (RB) supplied by the programmer, applying them to the data in a storage area known as Working Memory (WM). This triad lies on the bed of a Java application, which is responsible for loading the RB with the appropriate rules, starting a "session" with its WM, supplying Java objects to the session and kick-starting the IE into its operating cycle.
One thing the IE does in its operating cycle is the selection of objects from WM, which are called "facts" to distinguish them from all other objects. This selection is based on the condition part of a rule, as shown here:
rule "Hello Dolly" when Person( firstName == "Dolly" ) then //... end
As you might guess, the condition is the part between the
keywords "when" and "then", selecting a fact of class
Person
with its field firstName
containing "Dolly".
Two remarks are indicated:
==
used in the
comparison is a blunder: it's merely a convenience provided
by the rule compiler, which converts it to a call to
String.equals.
(The compiler is clever enough
to leave the operator in places where it is suitable.)
As you might expect, these Patterns may contain more than a simple comparison. Arbitrary constraints on a fact's fields may be combined in a list, implying a conjunction:
rule "Allo mon vieux" when Person( gender == Gender.MALE, age >= 80, country == "France" ) then //... end
Fact selection would be rather pointless without the capability of doing something with the retrieved objects. This is the domain of the consequence of a rule, written between "then" and "end". Here's the last rule, completed with a simple output statement:
rule "Allo mon vieux" when Person( gender == Gender.MALE, age >= 80, country == "France" ) then System.out.println( "Allo mon vieux" ); end
A consequence is simply written in Java, with a few
extensions, as we shall see in a minute. As you'd expect,
this code is executed by the IE once a matching
Person
object is found.
We can access located facts in the consequence by adding binding variables to facts and fields. If the requirement is to add the person's name to the salutation and log this action, we can write:
rule "Allo mon vieux X" when $person: Person( $name: name, gender == Gender.MALE, age >= 80, country == "France" ) then System.out.println( "Allo mon vieux " + $name ); SysLog.log( "Saluted " + $person ); end
It would be too rash for you to be disappointed after concluding that a rule is merely a strange way of coding something that might be expressed by a simple Java if statement. Quite the contrary: a rule is something fundamentally different. For one thing, each rule applies to all matches that can be found in WM, i.e., the IE acts rather according to this Java snippet:
for (Object fact : workingMemory) { if (fact instanceof Person) { Person person = (Person)fact; if (person.getGender() == Gender.MALE && person.getAge() >= 80 && person.getCountry().equals("France")) { System.out.println("Allo mon vieux " + person.getName()); SysLog.log("Saluted " + $person); } } }
And there is an even more distinctive gap between rule and if statement, which is illustrated by the following rule, where "code" fields contain IATA airport codes:
rule "Flights between two cities" when Airport( $depCode: code, city == "London" ) Airport( $arrCode: code, city == "Paris" ) $flight: Flight( departureCode == $depCode, arrivalCode == $arrCode ) then System.out.println( $flight ); end
This example is suggestive of the full matching power of an IE. Facts are located according to the constraint conditions, so that the first Airport pattern matches airports such as "LHR", "LGW" or "LCY", and the second one is bound to "CDG", "ORY", "LBG", and so on. Finally, the third pattern results in a selection of any flight from London to Paris, and then the consequence may execute for yet another triple of facts.
Comparable Java code might use three loops, and be rather inefficient if the full range of all objects were iterated or require some coding effort if maps were used for efficiency.
But isn't the IE required to perform much the same work to produce the outcome? A condition may have any number of patterns, with many matches per pattern and, consequently, even more combinations of two, three and more matching facts: the math is somewhat frightening. One solution to the many pattern with many object pattern match problem is known as the Rete algorithm, developed by Dr. Charles L. Forgy and first published in 1974. (Dr. Forgy adopted the term "Rete" - the latin word for "net" - because of its use in anatomy to describe a network of blood vessels and nerve fibers, resembling a diagram of the algorithm's fundamental data structures.)
A detailed discussion of the Rete algorithm is out of scope, but its basic concept is to sacrifice memory for speed. In a network compiled from the rules' patterns, references to facts are passed on to positions corresponding to patterns and the "joins" of patterns as long as conditions match or there is at least one accumulated tuple of facts to be joined with. A terminal node represents a rule. Most importantly, this approach permits factoring of alike patterns into a single network node, thereby reducing the evaluation effort. Further improvement is made by indexing, permitting fast access to facts whenever a field is tested for equality with a value.
So far I haven't said anything about the way objects are made into facts, and what may happen to them in WM except being selected for their participation in the "firing" of a rule, as the execution of its consequence is picturesquely called.
The span of an object's existence as a fact begins when it is
inserted, which can be done both in the embedding Java code
and from a rule's consequence. The IE will immediately
proceed to advance the new fact through the Rete network,
which may cause any number of rules to become ready to fire.
In the following example, the participating
Request
fact has fulfilled its mission, and
therefore it is retracted in the rule's consequence:
rule "Cheapest flights between two cities" when $request: RequestCheapestFlight( $fromCity: fromCity, $toCity: toCity ) Airport( $depCode: code, city == $fromCity ) Airport( $arrCode: code, city == $toCity ) $flight: Flight( departureCode == $depCode, arrivalCode == $arrCode, $price: price ) not Flight( departureCode == $depCode, arrivalCode == $arrCode, price < $price ) then System.out.println( $flight ); retract( $request ); end
In addition to the retract
method which is
implicitly taken to refer to the WM where this happens, the
rule also introduces another conditional element, i.e., the
one initiated with "not". This is the equivalent of the
negated existential quantifier from predicate logic,
denoted by ∄. The result of the conditional element is
a simple boolean value: true, if no fact matching the
following pattern exists, and false otherwise. It does not
contribute another fact to the accumulated tuple
(consisting of a RequestCheapestFlight
, two
Airport
facts and a Flight
fact; it
merely acts as a guard, inhibiting the firing of the rule
except for the cheapest flight being the fourth fact in the
assembled tuple.
A fact may also be destined to have some of its fields being
modified when a rule fires. The next rule simply checks
whether a Flight
fact contains existing airport
codes and sets the "checked" flag in the fact:
// Version 1 rule "Check flight data" when $flight: Flight( $depCode: departureCode, $arrCode: arrivalCode ) Airport( code == $depCode ) Airport( code == $arrCode ) then modify( $flight ){ setChecked( true ) } end
The modify statement is another extension available for coding consequences. It applies one or more method calls, enclosed in braces, to the object given in the parenthesis after "modify". But wouldn't the simple method call
$flight.setChecked( true );
achieve the same thing? The answer to this question exhibits another fundamental principle for RBS that are reactive to fact changes. While the straightforward setter call changes the object representing the fact, the IE is not aware of the effected change. And without being informed about a change, the IE will not reassess the fact's position in the Rete network, which might duly cause other rules to fire.
The principle of being reactive to change comprises insertions, modifications and retractions. Ensuing reevaluation creates new activations, which are queued for firing. Once fired, the activation is gone, otherwise rules would fire infinitely, which would be witless. Note, however, that a pending activation might very well be deactivated if some change falsifies one of the corresponding rule's conditions. Although this sounds dangerous, it's rarely a problem in practice.
You may have wondered why I added the comment "Version 1" to
the preceding rule, and perhaps you have guessed it by now.
The principle of being reactive to change does not exclude
the rule causing the change. Oops! Rule "Check flight
data" announces a modification of one of the participating
facts (the Flight
) and so it is subject to
reevaluation. All conditions are still fulfilled and so we
promptly receive another activation, the rule fires again,
and again, and again. A solid remedy is to provide a more
exact definition for the firing of the rule:
// Version 2 - corrected rule "Check flight data" when $flight: Flight( $depCode: departureCode, $arrCode: arrivalCode, checked == false ) Airport( code == $depCode ) Airport( code == $arrCode ) then modify( $flight ){ setChecked( true ) } end
Several features had to be left out for brevity's sake. One of them is the (unnegated) existential quantifier. Its usefulness can be seen from the following rule, a variant of "Flights between two cities":
rule "Are two cities connected" when Airport( $depCode: code, city == "Reykjavik" ) Airport( $arrCode: code, city == "Vienna" ) exists Flight( departureCode == $depCode, arrivalCode == $arrCode ) then System.out.println( "Direct flight from Reykjavik to Vienna available" ); end
Without "exists", the rule would fire for each flight. The quantifier constitutes a simple boolean guard, activating the rule once, and only once, independent from the actual number of flights.
Patterns may not only be written using class names of fact objects: it's equally possible to use interface and abstract class names. This significantly reduces the number of rules whenever it is possible to formulate constraints in terms of properties of the abstract class or the interface.
Regrettably, the rather deadpan examples do not let on about the potential fun that can be had while writing rules. Cracking the famous Zebra Puzzle or any of its variations, implementing heuristic rules for solving Sudoku or hopping from vertex to vertex as in Dijkstra's Algorithm - these are just a few of the pastimes to be had.
Finally, and to be fair, I ought to mention that, besides Drools, there are several other RBSes in Javaland, e.g. OpenRules and SMARTS, to name some of them.
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.