We present a DNA-based implementation of a reaction system with molecules encoding elements of propositional logic, that is, propositions and formulas. The protocol can perform inference steps using, for example,

Deoxyribonucleic acid (DNA) computing is a computational paradigm that uses organic molecules instead of traditional computer technologies to store and manipulate data. It is an interdisciplinary crossroads of biotechnology, nanotechnology, and computer science, based on manipulating DNA strands under special laboratory conditions. Its biggest advantage is the massive parallelism of reactions, which allows trillions of similar calculations to be performed at the same moment [

The idea was first implemented in 1994 by L.M. Adleman, a computer scientist from the University of Southern California. He showed how to solve in this way a well-known NP-complete problem, the Hamiltonian path problem (HPP): finding a Hamiltonian path in a graph. Nodes and edges of the graph were encoded by special single-stranded DNA molecules and then mixed in a test tube. All possible paths were created during the reaction, and then only the Hamiltonian paths were filtered out by standard laboratory steps. The experiment was carried out in the laboratory for a 7-node graph [

Following Adleman’s work, many further ideas for solving computational problems with DNA were presented. Among them were proposals for solving another NP-complete problem, SAT [

The main advantage of our protocol is the possibility to implement logic axioms and laws of natural deduction, which was impossible in previous implementations [

Elementary logic variables are simply called

Conjunctive normal form (CNF) expresses a logical formula as a conjunction of clauses, where each clause is a disjunction of literals. For example, the following formulae are in CNF:

To convert every formula written in CNF to our special form, in which disjunctions are replaced by implications, we need some more laws:
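One standard equivalence of this kind is material implication, by which a clause such as ¬a ∨ b can be read as the conditional a → b. The following minimal sketch is an illustration only and not necessarily the exact set of laws used here: it rewrites a clause so that all literals but the last are negated and moved into a conjunctive antecedent, and then confirms the equivalence by truth table. The function names and the string notation for literals ("a", "~a") are assumptions made for this example.

```python
from itertools import product


def negate(lit):
    """Return the opposite literal: "a" <-> "~a"."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def clause_to_implication(clause):
    """Rewrite the clause l1 v ... v ln as (~l1 & ... & ~l(n-1)) -> ln.

    Returns (antecedent_literals, consequent_literal).
    """
    *rest, last = clause
    return [negate(l) for l in rest], last


def holds(lit, env):
    """Evaluate a literal under a variable assignment."""
    value = env[lit.lstrip("~")]
    return not value if lit.startswith("~") else value


def equivalent(clause):
    """Check by truth table that the clause and its conditional form agree."""
    antecedent, consequent = clause_to_implication(clause)
    variables = sorted({l.lstrip("~") for l in clause})
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        as_clause = any(holds(l, env) for l in clause)
        as_rule = (not all(holds(l, env) for l in antecedent)) or holds(consequent, env)
        if as_clause != as_rule:
            return False
    return True
```

Choosing the last literal as the consequent is arbitrary; by contraposition, any literal of the clause could play that role.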

When the system has every formula rewritten in that way, it can autonomously process it and deduce new facts and formulas using

DNA (deoxyribonucleic acid) is a long polymer made from repeating units called nucleotides. Each nucleotide is composed of a sugar (deoxyribose), a phosphate group, and a nucleobase attached to the sugar. Nucleotides differ from each other only in the nucleobase, for which there are four possibilities: adenine, cytosine, guanine, and thymine, abbreviated by the letters A, C, G, and T. Most DNA molecules are double-stranded helices consisting of two long polymers. These strands bind to each other in opposite directions. One end of each strand is the 5′ end (typically carrying a phosphate group) and the other is the 3′ end (carrying a hydroxyl group). The bonds obey the Watson-Crick complementarity rule: adenine always pairs with thymine through two hydrogen bonds, and cytosine with guanine through three hydrogen bonds. The basic laboratory operations on DNA most commonly used in DNA computing are as follows.

Annealing and ligation.
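The complementarity rule that governs annealing can be illustrated on plain strings. The sketch below is only a symbolic model: the function names are hypothetical, both sequences are assumed to be written in the 5′ to 3′ direction, and no thermodynamics or partial matching is considered.

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the strand that pairs with `seq`, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def can_anneal(overhang_a, overhang_b):
    """Two single-stranded overhangs can pair when one is the
    reverse complement of the other (strands bind antiparallel)."""
    return overhang_a == reverse_complement(overhang_b)
```

For instance, under this model the overhangs `AACG` and `CGTT` can anneal, while two copies of `AACG` cannot.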

The DNA implementation of logic operations described in this paper depends heavily on the iterated operations of DNA ligation and cutting by a restriction enzyme and the combination of both. These operations have been previously used in models of DNA computing as sticker systems, ins-del systems, splicing systems, and many more; see, for example, [

Notice that both sites have the common pair of nucleotides CG in the middle. Let us furthermore consider two DNA molecules containing these sites. They can both be cut by their respective enzymes, and the four resulting molecules with sticky ends can then crossover anneal, producing two new molecules. Graphically, the operation of

The operation of splicing.
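As an informal illustration, splicing can be modelled on strings. The sketch below assumes the common textbook form of a splicing rule, a quadruple (u1, u2, u3, u4) meaning "cut the first string between u1 and u2, cut the second between u3 and u4, and exchange the tails"; the function name and the example strings are hypothetical and are not the restriction sites used in the protocol.

```python
def splice(x, y, rule):
    """Apply a splicing rule (u1, u2, u3, u4) to strings x and y.

    x = x1 u1 u2 x2 and y = y1 u3 u4 y2 yield
    z = x1 u1 u4 y2 and w = y1 u3 u2 x2.
    Only the first occurrence of each site is used.
    Returns None when either site is absent.
    """
    u1, u2, u3, u4 = rule
    i = x.find(u1 + u2)
    j = y.find(u3 + u4)
    if i < 0 or j < 0:
        return None
    cut_x = i + len(u1)  # cut position inside x, right after u1
    cut_y = j + len(u3)  # cut position inside y, right after u3
    z = x[:cut_x] + y[cut_y:]
    w = y[:cut_y] + x[cut_x:]
    return z, w


# Two illustrative sites that share "CG" in the middle.
result = splice("AATCGATT", "CCGCGCGG", ("T", "CGA", "G", "CGC"))
```

Here `result` is the pair `("AATCGCGG", "CCGCGATT")`: each product keeps the head of one input and the tail of the other.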

Formally, this operation is defined using the formal language framework. Let

It is known that the operation of splicing is powerful and that splicing systems with sets of splicing rules can generate all recursively enumerable languages under various restrictions. However, the use of splicing to implement logic operations is not mentioned in the relevant literature such as [

The implementation we present is based on the splicing system explained above. On the one hand, ligase is used to catalyse the joining of molecules annealed at their complementary parts. On the other hand, we use the restriction enzyme named Bse

The presented system is fully autonomous, which means that human assistance is needed only to prepare the constituents of the reaction, to mix them in a test tube, and to read the answer by electrophoresis after all the reactions have taken place. There is no need to add or remove any substances during the reaction. The restriction enzyme has to be added just once, and it autonomously finds the molecules which have to be cut.

Logic variables and their values are encoded by unique sequences of 4 nucleotides. Single-stranded sequences assigned to the same variable with different values (

To make examples easier to understand, sequences will be presented in the following way:

These unique 4-nucleotide sequences will be used not only for terms but also as a part of conditional rules or questions asked to the system.
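The number of available 4-nucleotide codes is small enough to enumerate directly. The sketch below rests on an assumption: that the two values of one variable are represented by mutually reverse-complementary sequences, so that each variable consumes one complementary pair. Under that assumption it counts how many such pairs exist and how many sequences are their own reverse complement; all names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the strand that pairs with `seq`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# All 4^4 = 256 sequences of length 4.
all_codes = ["".join(p) for p in product("ACGT", repeat=4)]

# Sequences equal to their own reverse complement (e.g. ACGT).
self_complementary = [s for s in all_codes if s == reverse_complement(s)]

# Unordered pairs {s, reverse_complement(s)} of distinct sequences.
complementary_pairs = sorted(
    {tuple(sorted((s, reverse_complement(s)))) for s in all_codes
     if s != reverse_complement(s)}
)
```

The enumeration gives 16 self-complementary sequences and 120 complementary pairs, which under the stated assumption bounds the number of distinct variables a 4-nucleotide code can address.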

Molecules representing terms share a common starting sequence and a constant length. They also contain the part recognised by Bse

There is one special molecule which is utilized in every reaction; it is called the

All the molecules encoding formulas in a test tube are assumed to exist in conjunction. Therefore, the existence of both values of the same variable in the same reaction (e.g.,

The reaction of inconsistency.

As a result we get a molecule with a length of 104 nucleotides in each strand. After the reactions terminate, the length of every molecule can be read by electrophoresis. If a molecule of this length occurs in the test tube, an inconsistency occurred during the reactions. In that case the set of encoded formulae is unsatisfiable, and any formula can potentially be derived from it by the deduction laws.

Observe that the enzyme Bse

Possible artifact molecules.

The first of these molecules has the same sticky ends as the terminal molecules and can therefore compete with them in the above reactions. However, since we assume that an abundant amount of each species of molecule is present during the reactions, this will not prevent the correct reactions from taking place. The second artifact molecule has self-complementary sticky ends and hence can repeatedly ligate to copies of itself. However, it cannot interfere with the other programmed reactions.

The simplest inference step involves just one fact in the antecedent and just one fact in the consequent of an implication. By the contraposition rule, one conditional offers two possibilities of inference; for example,

The first sticky end is always complementary to the antecedent and the second one is identical to the consequent. It is important to note that if we rotate the view of this molecule, we obtain the molecule representing the contrapositive conditional (without changing the orientation of the DNA strands):

If we mix in one test tube molecules representing

The reaction of inference for

As a result we get the molecule representing the fact

Inference rules containing a conjunction can be implemented as a conditional with the conjunction in either part: the antecedent or the consequent.

A conjunction in the consequent has to be divided into two simple conditionals. As mentioned in the earlier explanation of mathematical laws, for example,

A conjunction in the antecedent requires the construction of a new molecule which extends the simple conditional. It implements the inference using the first De Morgan law and the rule of contraposition. For the conditional

When a more complex conjunction has to be considered, that is,

If we mix in one test tube molecules representing

The reaction of inference for

As a result we get the molecule representing the fact

Assume that, instead of

The reaction of inference for

As a result we get the molecule representing the conditional

A disjunction can also be implemented in either part of a conditional: the antecedent or the consequent.

A disjunction in the antecedent has to be divided into two simple conditionals, in the same way as a conjunction in the consequent; for example

A disjunction in the consequent requires the construction of a new molecule, similar to the one described for conjunction. It preserves correct inference using the second De Morgan law and the rule of contraposition. For the conditional

Examples of the corresponding reactions are similar to those presented in the section concerning conjunction. If we mix in one test tube the molecules representing

Our reaction system can be asked questions like “is it possible to deduce a certain value of a given fact starting from the conditions which we know?” If we ask a question whether

Every question contains a sticky end identifying the variable (the complementary part) and a unique 4-nucleotide sequence identifying the question (which has to belong to the class of self-complementary sequences, excluding 3′-

If we mix in one test tube molecules representing

The reaction for a and the question

As a result we get a molecule with the length of 504 base pairs (bp). This length means a positive answer for the question

In this subsection we show that (a) if a positive answer to a question is deducible from the initial set of formulas, then there are reactions which would eventually produce the corresponding molecule of length 504 bp, and (b) if the system gives a positive answer, then it is truly deducible from the initial set of formulas. We can take for granted three simple reactions (based on splicing and possible only when some molecules have complementary sticky ends): the reaction of inconsistency (presented in Section

In the mathematical way, let

We define the formula of inconsistency as

For the proof we treat sets

Let

(

~(

Let us assume that the system gives the correct answer for a given set

Further possible implication schemes (up to 4 symbols) are

Let us denote completeness of inference by C(

For the proof of soundness, let us assume that the molecule representing positive answer emerged during the reactions in our test tube. One can observe the following.

When the molecule representing positive answer was produced in final test tube, it means that the molecules representing a basic fact and a question with mutually complementary sticky ends existed before. Only the molecules representing basic terms have restriction enzyme Bse

If the molecule representing this basic term did not exist in

There are no more possibilities of creating the molecule representing the positive answer because only the molecules representing facts (even received during some inferences) have enzyme Bse

For the reasons given above, we deem our system sound and complete. Using the laws of classical logic, it can run every possible inference and answer any question connected with it.
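The inference steps discussed above (modus ponens, contraposition, and the detection of a variable present with both values) have a simple symbolic analogue. The following sketch is a software model only and makes no claim about the chemistry or about the completeness argument: it stores each conditional as a clause and repeatedly derives the one literal that remains open. The function names and the literal notation ("a", "~a") are assumptions made for this example.

```python
def negate(lit):
    """Return the opposite literal: "a" <-> "~a"."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def infer(facts, rules):
    """Derive new facts until nothing changes.

    facts: iterable of literals known to hold.
    rules: list of (antecedent_literals, consequent_literal),
           read as (a1 & ... & an) -> c.
    Returns (known_literals, inconsistent).
    """
    known = set(facts)
    # (a1 & ... & an) -> c is the clause ~a1 v ... v ~an v c.
    clauses = [frozenset([negate(a) for a in ants] + [cons])
               for ants, cons in rules]
    conflict = False
    changed = True
    while changed and not conflict:
        changed = False
        for clause in clauses:
            if clause & known:
                continue  # already satisfied by a known fact
            open_lits = [l for l in clause if negate(l) not in known]
            if not open_lits:
                conflict = True  # every literal is contradicted
                break
            if len(open_lits) == 1:
                known.add(open_lits[0])  # forced by the other literals
                changed = True
    # A variable present with both values signals inconsistency.
    conflict = conflict or any(negate(l) in known for l in known)
    return known, conflict


def ask(question, facts, rules):
    """Answer whether the literal `question` was derived."""
    known, _ = infer(facts, rules)
    return question in known
```

With the facts `a` and `~d` and the rules `a -> b` and `(b & c) -> d`, the model derives `b` by modus ponens and then `~c` by contraposition, without reporting an inconsistency.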

The experiment steps in laboratory can be briefly described as follows:

(1) encoding every clause to DNA molecules,

(2) mixing all the molecules in one test tube and letting all the possible ways of resolution be done automatically: looking for inconsistency; expanding the knowledge about facts and its value by

(3) filtering the result using (gel) electrophoresis: checking if the molecule signalling inconsistency was created during the reaction; otherwise checking if some of the molecules connected with prepared questions were produced. The process ends here.
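The decision logic of the final read-out step can be sketched as follows. This is a simplified model under stated assumptions: the observed lengths are taken to be exact integers (a real gel gives approximate band positions), each prepared question is assumed to yield a product of a known length, and the function and parameter names are hypothetical. The default of 104 is the length given above for the inconsistency molecule.

```python
def read_out(observed_lengths, question_lengths, inconsistency_length=104):
    """Interpret molecule lengths read from the gel.

    observed_lengths: set of molecule lengths found after the reaction.
    question_lengths: dict mapping a question label to the length of
                      the molecule that signals a positive answer.
    Returns ("inconsistent", []) or ("consistent", [answered questions]).
    """
    # Inconsistency is checked first: if it occurred, any formula is
    # derivable and individual answers carry no information.
    if inconsistency_length in observed_lengths:
        return "inconsistent", []
    answered = [q for q, n in question_lengths.items()
                if n in observed_lengths]
    return "consistent", answered
```

For example, `read_out({504, 60}, {"a": 504})` reports a consistent set with a positive answer to the question labelled `a`.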

Steps 1 and 3 are constant time operations which means time complexity

The space complexity is regarded as an asymptotic number of different DNA molecules (note that for proper reaction, each of them has to be present in many copies). Every fact, question, and simple implication needs exactly one molecule. For more complicated implications (using more than one literal in antecedent and/or more than one literal in consequent) it is better to prepare

In this paper we introduced a new DNA inference system based on the classical idea of splicing. It works with any formulae presented in special normal form which uses negation, conjunction, and implication. According to the laws of classical logic, every other formula can be transformed to such a form. The actual goal was to show that it is possible to run logical inference by DNA with two possible values of facts (

Implementation of the most important laws of classical logic, which are necessary in connection with negation, was also presented so that the system is capable of any sequence of natural deduction steps. The presented model utilizes restriction enzyme Bse

In conclusion, given that elementary reactions of splicing were laboratory-verified in [

The authors declare that there is no conflict of interests regarding the publication of this paper.

This work was supported by the European Regional Development Fund in the IT4Innovations Centre of Excellence Project (CZ.1.05/1.1.00/02.0070).