Minimal history, a theory of plausible explanation

 

John E. Mayfield

Department of Genetics, Development, and Cell Biology and Interdepartmental Graduate Program in Bioinformatics and Computational Biology
2106 Molecular Biology Building, Iowa State University, Ames, Iowa 50011, USA

 

Telephone       515-294-6847

FAX                515-294-6669

e-mail              jemayf@iastate.edu

 

Non-technical summary:

This paper offers a new way to define object complexity.  Minimal history is a comprehensive measure that characterizes all of the information required for the creation of an object as well as all computable internal relationships that are present.  Objects with a long minimal history are characterized by a slow growth property, meaning that they cannot be created rapidly from simple inputs.



 

 

Abstract

In computational theory, time is defined in terms of steps, and steps are defined by the computational process.  Because steps can be described, a computation can be recorded as a binary string.  This allows time to be measured in bits, which, in turn, allows the definition of various computable complexity measures that account for the minimal amount of computation required to create an object from primitive beginnings.  Three such measures are introduced in this paper: "transcript depth", which is closely related to logical and computational depth; "Kd complexity", which is similar to Levin's Kt complexity; and "minimal history".  The latter two measures are comprehensive in the sense that they characterize all information required for the creation of an object and also all computable internal relationships and redundancies that are present in the object.

 

 

Introduction

It has long been recognized that object complexity can be characterized either in terms of state description or in terms of a process for creating the object [1].  State descriptions (descriptions of objects in isolation) do not involve time; like photographs, they are concerned simply with objects at a single moment.  State descriptions are associated with observation.  Process descriptions characterize objects in terms of activities that occur over time and space.  A process can be viewed as a sequence of states over time that includes the state of interest.

 

Some aspects of structure can only be understood and quantified by taking into account processes which are capable of producing the structure of interest.  One such structural property is “slow growth.”  An object characterized by slow growth is one that is difficult to create from simple inputs.

 

Bennett [2] defined the "logical depth" of binary objects in order to formalize the "slow growth" property of some digital objects.  Deep objects can only be created quickly from other deep objects, while the creation of a deep object from shallow input must take extensive computation (many steps).  Logical depth was defined as the minimal number of steps required for a universal [Turing] machine to output the object from nearly minimal input.  Juedes et al. [3] introduced the term "computational depth" and placed the concept on firm mathematical footing.  The slow growth property is characteristic of many objects in human experience, including living organisms and most cultural artifacts, yet depth is not widely recognized as a centrally important property of objects.  There are various reasons for this: logical and computational depth are formally incomputable; depth has units of time and so does not look like a complexity measure; the formal definition includes a "fudge factor" that introduces uncertainty; logical and computational depth rely on the Turing notion of discrete sequential computation; and, finally, standard physics does not define a characteristic property of objects based on the minimal time required for their formation.

 

This paper addresses several of these difficulties while remaining faithful to the general framework that underlies the definition of logical depth.  New measures are defined with properties much like those of logical depth, but which are formally computable, are measured in units of information, and require no fudge factor.  The definitions require that time be measured in bits, and they are based on the idea that the most plausible process for creating an object from nothing is the minimal process that yields the object in the absence of initial knowledge of either the process or the object.

 

 

Comprehensive Measures

Process is studied extensively in both physics and computer science, but the outcomes of processes are less well understood.  In particular, there are no generally agreed upon ways of characterizing relationships between parts. 

 

We currently have two widely useful (and compatible) measures of object complexity.  Both are said to measure the amount of information present in an object.  The first is based on membership in an ensemble of possibilities.  The entropy of an ensemble is defined as H = Σ p_i log2 p_i, where p_i is the probability of the ith possible state (note that this definition yields negative entropy values, consistent with the common convention that entropy and information have opposite signs).  When an object consists of a single state drawn from an ensemble of possible (but not realized) states, the information "content" of that object is equal to the negative of the entropy.  When an object consists of a subset of system states, a convenient expression for information is provided by
I = H_obj - H_sys, where H_obj is the entropy of the object ensemble and H_sys is the entropy of the system ensemble.  I (information) measures the improbability of an object in relation to the various potential states of a system.  This notion of information dates back 100 years to Boltzmann and is usually referred to as Shannon information.
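As a concrete illustration of these definitions, here is a minimal sketch in Python using the sign convention adopted above; the eight-state system and two-state object are hypothetical choices made only for the example:

```python
import math

def entropy(probs):
    """Entropy with the sign convention used in the text:
    H = sum(p_i * log2(p_i)), which is negative for any non-trivial ensemble."""
    return sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical example: a system of 8 equally likely states, and an object
# known to occupy one of only 2 of those states (with equal probability).
H_sys = entropy([1/8] * 8)   # -3 bits
H_obj = entropy([1/2] * 2)   # -1 bit
I = H_obj - H_sys            # I = H_obj - H_sys = 2 bits of Shannon information
print(H_sys, H_obj, I)
```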

 

The second measure of object complexity is based on the theory of computable functions.  The idea here is that any discrete object can be created by the action of discrete instructions on discrete input.  This action can be simply represented as
M_r(p) = x, where M_r is a computing machine designed to carry out instructions r, p is input to the machine, and x is output.  For simplicity, r, p, and x are treated as binary strings.  The machine serves as a function that matches strings with strings.  Because the computation of x is also a description of x, the shortest input that results in output x must also be a minimal description of x.  We call this measure the algorithmic information, or Kolmogorov-Chaitin complexity, of x and designate the quantity as K(x).  Clearly K(x) depends on the machine used, but if the machine is universal, then for any two machines r and s, the relationship K_r(x) ≤ K_s(x) + c, where c is independent of x, guarantees that for large values of K(x) the details of machine selection are unimportant [4].  That K(x) is an entropy can be seen by recognizing that the relevant state probabilities have simply been redefined as 2^-K(x) instead of 2^-n (where n is the length of x).
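K(x) itself is incomputable, but compressed length gives a feasible upper-bound proxy that conveys the intuition.  The sketch below, using only standard Python libraries and purely for illustration, shows that a redundant string admits a much shorter description than a random one:

```python
import os
import zlib

def compressed_length_bits(x: bytes) -> int:
    """Upper-bound proxy for K(x): bit length of a zlib-compressed encoding.
    This is only an illustration; K(x) proper is defined via a universal
    machine and is incomputable."""
    return 8 * len(zlib.compress(x, 9))

redundant = b"01" * 500        # 1000 bytes with an obvious pattern
random_ish = os.urandom(1000)  # 1000 bytes, incompressible with high probability

print(compressed_length_bits(redundant))    # far fewer than 8000 bits
print(compressed_length_bits(random_ish))   # close to (or slightly above) 8000 bits
```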

 

Neither of these information measures accounts for the difficulty of creating an object.  Difficult-to-create objects are characterized by extensive interrelationships between parts.  Efforts to characterize this aspect of object complexity fall into two categories.  One approach is to view processes as successions of states related through conditional probabilities [5, 6], and the other is to view processes as computations and to characterize objects in terms of the elementary computer steps required to create them [2, 3].  Both approaches associate the term "depth" with the resulting complexity measure.  This paper follows in the tradition of Bennett.

 

 

Minimal History:

The following theory is based on two observations: first, it is possible to record a computation in binary by recording all of its steps; and second, a computer is simply the embodiment of an ordered set of rules.  A consequence of these observations is that all aspects of a computation, the input, the computer [rules], the output, and the computation itself, can be represented as binary strings.  We will also be guided by the observation that some combination of random bit generation and deterministic computation provides the most efficient way to create a [binary] object.  This insight also underlies the concept of algorithmic complexity [4, 7].

 

Because there are many possible inputs and corresponding computations that would produce the same output, only minimal inputs and minimal computations are privileged properties of objects.  Because it is not possible to simultaneously and individually minimize the rule-set, the input, and the length of the computation without additional criteria, we employ a specific three-step process: first, imagine that a machine (set of rules) is randomly created; second, input is randomly created; and third, the input is acted upon deterministically according to the rule set, [sometimes] yielding output.

 

Since nothing is initially known, we assume that random [binary] choices are responsible for creating rules and input for the rules to act upon.  For simplicity of analysis, let us assume that the rules are generated first, then the input, and only then is the input acted upon according to the rules to produce output.  It is not necessary for all the rules and all the input to be generated before any computation is performed, but it is easier to conceptualize if we keep the three parts separate. 

 

The formal creation of [binary] object x from nothing begins with a process that randomly generates two strings, r and p.  String r constitutes rules for the manipulation of p (p may include additional instructions).  Most strings r do not constitute valid rule sets and will not enable any action.  Random generation of strings r and p, if systematically repeated, will eventually yield a pair of strings with the property that if p were to be acted on according to r, x would be produced in steps described by another string d.  String d is a recording of all the [binary] decisions required to carry out the deterministic computation.

 

Following this scheme, the length of a process for generating x from nothing grows exponentially with the length of r and p and linearly with the length of d.  If all aspects of the process are measured in bits, the length of the shortest description is the minimal number of binary decisions (whether randomly made or not) required to create x.  Our goal is to characterize that process which requires the least number of such decisions.  The process and its minimization are formalized as follows:

 

There are 2^|r| binary strings of length |r|, and |r|·2^|r| characterizes the expected number of random choices [tosses of an ideal coin, for example] needed to generate a particular string r if there is no a priori knowledge of r; it requires of order |rp|·2^|rp| choices to randomly create a particular pair of strings r and p.

It follows that G = |rpd|·2^|rp| characterizes a creation of x, and that min G characterizes the minimal creation of x in the absence of prior information.

We define h = rpd to be a history of x.

And let h' = r'p'd' be a minimal history of x.  It is a binary string that records all aspects of a minimal computation compatible with the minimal creation.  There may be more than one such computation.

The length of the minimal history of x is formally defined as:

|h'| = |r'p'd'| = min{ |rpd| : G = |rpd|·2^|rp| is minimal and r(p) = x }                     [1]

It measures the minimal non-deterministic process required to create x from nothing.
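The minimization in [1] can be made concrete with a small sketch.  The toy rule language, the convention that the transcript d records one bit per elementary write, and the length bounds used below are my own illustrative choices, not part of the formal theory; the sketch only shows the shape of a search that minimizes G = |rpd|·2^|rp| and tie-breaks on |rpd|:

```python
from itertools import product
from math import log2

# Toy rule language (an illustrative stand-in, NOT the paper's formalism):
# r is read as 2-bit opcodes: "00" -> write '0', "01" -> write '1',
# "10" -> append a copy of the input p, "11" -> double the output so far.
def run(r: str, p: str, step_limit: int):
    """Deterministically apply rules r to input p.  Returns (output, transcript d),
    where d records one bit per elementary write, or None if step_limit is exceeded."""
    out, d = "", ""
    for i in range(0, len(r) - 1, 2):
        op = r[i:i + 2]
        if op == "00":
            writes = "0"
        elif op == "01":
            writes = "1"
        elif op == "10":
            writes = p
        else:                           # "11"
            writes = out
        if len(d) + len(writes) > step_limit:
            return None                 # abandoned: cannot beat the current best G
        out += writes
        d += writes                     # transcript of the bits written
    return out, d

def minimal_history(x: str, max_rp: int = 12):
    """Brute-force search for (r, p, d) minimizing G = |rpd| * 2^|rp| on the toy
    machine, with ties broken by the smallest |rpd| (a miniature condition [1])."""
    best = None                                      # (G, |rpd|, r, p, d)
    for n in range(2, max_rp + 1):                   # total length |r| + |p|
        for r_len in range(2, n + 1, 2):             # opcodes come in 2-bit units
            for r in map("".join, product("01", repeat=r_len)):
                for p in map("".join, product("01", repeat=n - r_len)):
                    limit = best[0] if best else 10 ** 6
                    result = run(r, p, step_limit=limit)
                    if result is None or result[0] != x:
                        continue
                    d = result[1]
                    G = len(r + p + d) * 2 ** len(r + p)
                    if best is None or (G, len(r + p + d)) < (best[0], best[1]):
                        best = (G, len(r + p + d), r, p, d)
    return best

G, h_len, r, p, d = minimal_history("0110" * 8)      # a 32-bit string with structure
print("|h'| =", h_len, "  Kd =", len(r + p) + log2(h_len))
```

On this toy machine the search settles on a 4-bit input plus a short copy-and-double program rather than writing the 32 bits directly, which is the trade-off the exponential weighting of |rp| is meant to capture.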

 

Note that, according to equation [1], incrementation of either |r| or |p| contributes equivalently to G and |h'|.  This reflects the indistinct boundary between computer and program.  Rules can be encoded either in the design of the computer or in the program utilized by the computer.

 

Min G characterizes the minimal process for creating x from nothing, while |h'| characterizes the minimal computation employed by the minimal process.  Levin [8] defined a measure of complexity, Kt(x) = min{|p| + log t}, to aid in his proof of a time bound on inverting problems.  Defining Kd = log{min G} = min{|rp| + log|rpd|} = |r'p'| + log|h'| creates a measure similar to Levin's, but one in which |h'| plays the role of time.
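A hedged numeric illustration (the lengths below are hypothetical, chosen only to show how the measure weighs rules-plus-input against the length of the computation):

```python
from math import log2

# Hypothetical values, for illustration only (not derived from any real object):
rp_min = 30                  # |r'p'| in bits
h_min = 2 ** 40              # |h'| = |r'p'd'| in bits

Kd = rp_min + log2(h_min)    # Kd = |r'p'| + log|h'|
print(Kd)                    # 70.0 bits
```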

 

The issue arises as to whether or not the theory is machine and language independent.  Clearly, if one wishes to determine values for |r|, |d|, and |h|, one will need to fix a formalization in which to express the rules, r, and a code for the steps.  Conceptually, however, the theory is independent of any particular language or of any particular machine design.  Fundamentally, |h'| measures the minimal number of binary choices necessary to characterize the creation of a binary object from nothing.

 

Comments on h', d', p', and r'

 

Comment 1:  |h'|, |p'|, |d'|, and Kd do not [generally] characterize the shortest computation of x.

"Copy x" is generally the shortest computation of x; but ½h'½ and Kd only measure "copy x" when x is contains little redundancy.  Likewise, ½p'½ does not generally measure the shortest input to a computation that outputs x, and |d'| does not generally measure the shortest computation.  ½h'½ ½p'½, ½d'½, and Kd represent compromises balancing the inherently slow process of finding objects by random search against the fast but potentially long process of deterministically computing x.

 

Comment 2:  h', Kd, d', p', and r' are computable (for a specified language).

Because there is always at least one finite computation of the form "copy x" that outputs x and halts, a well-structured search of possible rules and inputs, starting with the smallest and systematically increasing input and computation length, must yield in finite time the computation that satisfies [1].  The upper bound at any stage of such a search is established by G = |rpd|·2^|rp|.  For non-halting computations, d is undefined, but because unfinished computations are simply left unresolved, the search does not founder on the halting problem.  If there is more than one computation that satisfies the minimum G upper bound, then the minimum |rpd| condition establishes |h'|.
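Using the toy encoding from the earlier sketch (where opcode "10" means "copy the input"), the "copy x" computation supplies the finite upper bound this comment relies on: any trial whose transcript alone would exceed the current bound can be abandoned, so non-halting candidates never stall the search.  The numbers are specific to that illustrative toy machine:

```python
def copy_x_bound(x: str) -> int:
    """G value of the trivial 'copy x' computation on the toy machine:
    r = "10" (copy the input), p = x, transcript d of |x| written bits."""
    r, p, d = "10", x, x
    return len(r + p + d) * 2 ** len(r + p)

# A finite bound exists before the search even starts; every later trial run
# is cut off once its transcript could no longer beat the current best G.
print(copy_x_bound("0110" * 8))   # 66 * 2**34, large but finite
```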

 

Comment 3: |d'| and |h'| are closely related to logical depth.

Transcript depth, |d'|, is very similar to logical (or computational) depth.  The differences are that |d'| has units of information rather than steps, and that the length of the input that results in d is |p'| rather than K(x) + s (where s is Bennett's significance factor).  The coding employed weights the elementary computer steps used in the computation; for logical depth, all steps are weighted equally.

 

|h'| provides a measure of the most plausible explanation for x.  It combines the information and the time required to create x by the most efficient method that requires no prior knowledge.  Because |d'| >> |r'p'| for deep objects, |d'| ≈ |h'| for such objects.

 

Comment 4: Bennett's significance factor does not appear.

Computer r' is not necessarily universal, but if r' does describe the universal machine used to define logical depth, then the optimal significance factor is given by s = |p'| - K(x).  The optimization specified by equation [1] eliminates concerns about slightly shorter but much slower computations for producing x.

 

Comment 5: |d'| and |h'| have well-defined upper limits.

Allowing double primes to indicate the [minimal] computation "print x", the following relationship limits |d'|:

 

½r"p"d"½2½r"p"½ ³ ½r'p'd'½2½r'p'½

 

Recognizing that p" » x, substitution and rearrangement leads to:

 

|d'| ≤ |r"xd"|·2^|r"x| / (|r'p'|·2^|r'p'|),

and

 

|h'| = |r'p'd'| ≤ |r"xd"|·2^|r"x| / 2^|r'p'|.

 

From these relationships we see that |d'| and |h'| are always bounded from above.  Recognizing that |d"| is of the same order as |x|, and assuming that |r"| ≈ |r'|, we see that the upper bound for both |d'| and |h'| grows approximately as |x|·2^(|x| - |p'|).
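To make the bound concrete, the values below are borrowed from the illustrative toy machine sketched earlier (so purely hypothetical): with |r"| = 2, |x| = |d"| = 32, and |r'p'| = 12, the bound on |h'| is finite but very loose compared with the observed toy value |h'| = 44:

```python
# Hypothetical lengths (taken from the toy example above, illustration only):
r2_len, x_len, d2_len = 2, 32, 32    # "copy x": |r"|, |x|, |d"|
rp_min = 12                          # |r'p'| of the shorter doubling program

bound_on_h = (r2_len + x_len + d2_len) * 2 ** (r2_len + x_len) / 2 ** rp_min
print(bound_on_h)                    # about 2.8e8 bits, far above |h'| = 44
```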

 

Comment 6:  |h'| and |d'| have the slow growth property.

Slow growth formalizes the observation that some discrete objects are harder to compute from simple input than others [2].  Hard-to-compute objects can only be computed easily (rapidly) from hard-to-compute input, and they take much computation (require a slow process) to create from easy-to-compute input.  Bennett coined the term "depth" to quantify the difficulty of object computation.  Since |h| provides a measure of historical activity, |h'| has the property that when the minimal history of a discrete object is much greater than the minimal history of the input used to create that object, the computation must be long (the amount of computational history accumulated during such a computation must be large).

 

Theorem (slow growth law of minimal history): Let x and y be two strings with respective minimal histories, hx' and hy'.  If |hy'| ≥ |hx'|, then the amount of minimal history characterizing the computation of y from input x must be ≥ |hy'| - |hx'|.

 

Proof:

1.      Definition: The length of the minimal history of y is given by |hy'|

2.      Definition: The length of the minimal history of x is given by |hx'|

3.      Definition: The length of the minimal history of the computation of y, with x as the only input, is given by |hxy'| = |r"xd"|, where r" and d" denote the shortest combination of computer and computation that outputs y from input x and that is compatible with |rxd|·2^|rx| being minimal (condition [1]).

4.      |hx'| + |hxy'| ≥ |hy'|.  This is true because an added requirement to produce x (even by a separate computation) cannot lessen the minimum amount of computation required to create y.

5.      By rearrangement, |hxy'| ≥ |hy'| - |hx'|.

Thus, only if |hy'| - |hx'| is small is it possible for the added minimal history characterizing a process that produces y from input x to be small (a rapid computation).  On the other hand, if y is deep and x is shallow, then |hxy'| must be large (the computation must be lengthy).
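A hedged numeric illustration of the theorem (the history lengths are hypothetical): if input x is shallow and target y is deep, the computation linking them must itself carry essentially all of y's minimal history:

```python
# Hypothetical minimal-history lengths, in bits (illustration only):
h_x = 1_000           # |hx'| : a shallow input x
h_y = 10 ** 9         # |hy'| : a deep target y

# Slow growth law: |hxy'| >= |hy'| - |hx'|
h_xy_lower_bound = h_y - h_x
print(h_xy_lower_bound)   # 999_999_000 bits: the computation must be long
```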

 

A similar proof holds for |d'|.

 

 

Discussion

This paper presents a theory of computable object complexity which offers specific measures that are complete in the sense that they encompass all computable structure.  Minimal history and Kd characterize the minimal creation of binary objects by balancing the slow process of random search against the length of deterministic computation.  Both measures exponentially weight program length relative to time.  The weighting derives directly from the exponential growth of random searches and the linearity of deterministic computation.  Levin proposed the same weighting when he introduced Kt complexity [4, 8].  The theory is based on the notion that there is an optimal combination of computer (rules), input, and run length for producing any object.  The optimality is established by minimizing the number of binary choices needed to generate an object.  Evaluation of |h'| and Kd clearly requires that a language be specified, but the basic intuition is language independent.

 

It is of considerable interest that algorithmic complexity, object length, and minimal history establish a natural hierarchy of complexity measures according to the relationships K(x) ≤ |x| + c and |x| < |h'|.  This hierarchy of information-like complexity measures has the following characteristics:

 

·        K(x) is the smallest of all information measures of x.  It does not include a measure of the universal computer specified, and it does not account for redundancies present in x or for relationships between parts of x that may exist.

·        |x| is a direct measure of x.  It includes a measure of redundancy but does not measure internal relationships.

·        |h'| provides a comprehensive measure of x by combining minimized measures of the computer, the input, and the amount of computation required to create x.  It accounts for all internal [computable] relationships and redundancies that may be present in the object.  |h'| measures the minimal process for producing x from nothing.

 

The hierarchical relationship K(x) ≤ |x| < |h'| is always true within a small constant c.  Significantly, the smallest measure is associated with the greatest constraint, the intermediate measure with intermediate constraint, and the largest measure with least constraint.

 

By its definition, K(x) cannot increase during deterministic computation.  Since the algorithmic complexity of output x is defined to be the length of the shortest input to a particular universal machine that yields x as output, x cannot be created from input that can itself be created from shorter input using that same machine.

 

The length of a string x, |x|, is closely tied to thermodynamic concepts of entropy and negentropy.  Bennett [9] and Fredkin and Toffoli [10] showed, by analyzing reversible computation, that there is no lower limit to the amount of energy required to manipulate information provided that no bit is erased or created.  Erasing a bit irreversibly releases a minimum of kT·ln 2 of energy as heat (k is the Boltzmann constant and T the absolute temperature).  It follows that the creation of a bit must require the input of kT·ln 2 of energy.  This means that the creation of computational output longer than its input requires a minimum of kT·ln 2 of energy per additional bit to write the output to a blank output device.  Thus, the fundamental constraints on |x| are that it cannot increase during a reversible computation, and that any change in length that occurs in a non-reversible deterministic computation must be paid for by the input or release of thermodynamic energy.
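For scale, a quick evaluation of the kT·ln 2 figure; the choice of T = 300 K as room temperature is mine, made only for illustration:

```python
import math

k_B = 1.380649e-23              # Boltzmann constant, J/K
T = 300.0                       # room temperature in kelvin (illustrative choice)

energy_per_bit = k_B * T * math.log(2)   # kT*ln2, the Landauer limit
print(energy_per_bit)                    # ~2.87e-21 joules per bit erased or created
```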

 

|h'| is subject to the least constraint.  It may increase during reversible deterministic computations without the expenditure of thermodynamic energy [assuming an ideal machine].  |h'| is constrained only by time (computation steps).

 

Other measures discussed in this paper seem less fundamental than |h'|.  Thus, |d'| provides a measure of the minimal computational activity required to produce x, but does not account for all inputs.  Kd = |r'p'| + log|h'| provides a complete measure of x, but does not relate as simply to other measures as |h'| does.  In particular, Kd can be greater or less than |d'|, depending on the particular values of |r'p'| and |d'|.  In contrast, |h'| is always greater than |d'|.

 

It is important to note that even though |h'| and |d'| are formally computable (for a specified encoding), they may not be practically computable.  The search space for all but the simplest inputs is infeasibly large.  Thus, the identification of feasibly computable estimators of depth (or minimal history) remains an important challenge.  That this may be possible is suggested by the work of Lathrop [11, 12].  Feasibly computable measures of depth would allow computer experiments that explore the nature of various processes that generate depth.

 

Finally, it should be recognized that the ideas presented in this paper offer a new way of looking at the subject of object complexity, but that the paper includes no mathematical proofs of great significance and makes no effort to identify specific features of computations that lead to increases in minimal history.  The direct impact of the observations presented is philosophical.  I show that a generalized notion of depth can be directly related to accepted ideas of information and complexity; and that when this is done, the notion of complexity is seen to naturally incorporate aspects of both time and space. 

 

Formally, the ideas presented here apply only to discrete objects, but there are tantalizing similarities between the properties of binary and physical objects.  We are accustomed to the idea that matter is discrete.  It is unknown whether space and time are discrete or continuous [13, 14].  If they are discrete, then continuous mathematics provides only approximations of reality, and computation is a more fundamental way to describe our universe.  Such a conclusion would also imply that physical objects are subject to a slow growth law and that the entire universe can be described in terms of information creation, its manipulation, and the complexification resulting from that manipulation.

 

Some writers have recognized the need to formally account for the tendency of our world to complexify – to become more complex over time [14, 15, 16, 17].  Minimal history and depth provide measures of these trends.  In a separate paper, I argue that under certain circumstances the process of evolution may naturally lead to increases in complexity [18] when complexity is defined by any measure that exhibits slow growth.


References

 

1.  Simon, H.A. The architecture of complexity. Proc. Am. Phil. Soc. 1962, 106, 467-482.

2.  Bennett, C.H. Logical depth and physical complexity. In: The Universal Turing Machine: a half-century survey; Herken, R., Ed.; Oxford University Press, Oxford, 1988; pp 227-257.

3.  Juedes, D.W., Lathrop, J.I., and Lutz, J.H. Computational depth and reducibility. Theoretical Computer Science, 1994, 132, 37-70.

4.  Li, M. and Vitanyi, P. An Introduction to Kolmogorov Complexity and Its Applications. 2nd ed.  Springer Verlag, Berlin, 1997.

5.  Lloyd, S. and Pagels, H. Complexity as thermodynamic depth. Ann. Physics, 1988, 188, 186-213.

6.  Crutchfield, J.P. and Shalizi, C.R. Thermodynamic depth of causal states: Objective complexity via minimal representations. Physical Review E, 1999, 59, 275-283.

7. Kolmogorov, A.N. Three approaches to the quantitative definition of information. Problems of Information Transmission, 1965, 1, 1-7.

8.  Levin, L.A. Information and independence in mathematical theories. Information and Control, 1984, 61, 15-37.

9.  Bennett, C.H. The thermodynamics of computation – a review.  International Journal of Theoretical Physics, 1982, 21, 905-940.

10. Fredkin, E. and Toffoli, T. Conservative Logic. International Journal of Theoretical Physics, 1982, 21, 219-253.

11. Lathrop, J.I. Computing and evolving variants of computational depth.  Ph.D. Dissertation, Iowa State University, Ames, 1997. 

12. Lathrop, J.I. Compression depth and genetic programs. In: Genetic Programming 1997; Koza, J. Ed., Morgan Kaufmann, San Francisco. 1998; pp 370-378.

13. Smolin, L. The Life of the Cosmos. Oxford University Press, Oxford, 1997.

14. Wolfram, S. A New Kind of Science. Wolfram Media, Champaign, 2002.

15. Davies, P. The 5th Miracle. Touchstone, New York, 1999.

16. Heylighen, F. The growth of structural and functional complexity during evolution. In: The Evolution of Complexity, the Violet Book of "Einstein Meets Magritte"; Heylighen F. and Bollen J. Eds.; Kluwer, London, 1999; pp 1-16.

17. Kauffman, S. At Home in the Universe. Oxford University Press, Oxford, 1995.

18. Mayfield, J.E. Evolution as Computation. Evolutionary Theory (In press), 2005.