A Very Large Scale Pattern Matching Problem in MathTensor


This is an unevaluated Mathematica notebook. If you do not have a copy of Mathematica handy, you may want to also view the evaluated version

Steven M. Christensen

MathSolutions, Inc.
Steven M. Christensen and Associates, Inc.

steve@smc.vnet.net

Based on a talk presented at the Mathematica Developer's Conference, Urbana, Illinois, October, 1995. This work is part of a collaboration with Dr. Stephen A. Fulling of the Mathematics Department at Texas A & M University.

Historical Introduction

The main reason I started working with SMP and then Mathematica was to do calculations in quantum field theory and gravitation. In these fields it is very common for incredibly large so-called symbol crunching calculations to be generated. Equations with thousands or even millions of terms can appear. In many cases it is necessary to apply a number of rules to each term in order to simplify the expressions down from many terms to a few (maybe tens or hundreds). One of the main problems we found can be explained simply:


  Sum[coeff[i] RiemannTerm[i],{i,1,25}]

Each coeff[ ] is some exact ratio of integers. Each RiemannTerm[ ] is some potentially complicated tensorial expression. There may be a few million terms of this kind. We know that almost all of these terms can be written as a linear combination of a much smaller number of "basis" RiemannTerms. So we might have

RiemannTerm[1], RiemannTerm[2], RiemannTerm[3]

as the basis and a lot of rules relating the other terms to these as in

RiemannTerm[25] -> rulecoeff[1] RiemannTerm[1] +
rulecoeff[2] RiemannTerm[2] +
rulecoeff[3] RiemannTerm[3]

In our calculations, we need to have the computer to the following things:

Load MathTensor


  <<MathTensor.m

Generate the equations we want to work with.

We already know that the terms in our equations will be constructed from products of the Riemann tensor, the object that measures the curvature of spacetimes. The Riemann tensor has a very well known structure. It is an object with four tensorial indices and these indices have an important set of symmetries.

The Riemann tensor has four indices, here "covariant" lower indices as entered in MathTensor.


  RiemannR[la,lb,lc,ld]

This object is antisymmetric under interchange of the first pair or second pari of indices.


  RiemannR[lb,la,lc,ld]


  RiemannR[la,lb,ld,lc]

The Riemann tensor is symmetric under interchange of the two pairs.


  RiemannR[lc,ld,la,lb]

If one index is raised to its contravariant position and summed with its corresponding lower index, the Riemann tensor is equal to the so-called Ricci tensor which is symmetric in its two indices.


  RiemannR[la,lb,ua,ld]

Because of the antisymmetries in the pairs of indices, when one of the pairs is summed, the result is zero.


  RiemannR[la,ua,lc,ld]

If the indices of the Riemann tensor are summed across pairs or on the indices on the Ricci tensor, we obtain the Riemann Scalar.


  RiemannR[la,lb,ua,ub]

MathTensor understands these properties and applies them automatically to any terms generated. Many important rules are included in a set of RiemannRules. Other rules are also available, like the cyclic identity:


  ?RiemannCyclicSecondThreeRule


  RiemannR[lc,la,ld,lb] /. 
  RiemannCyclicSecondThreeRule[lc,la,ld,lb]


  RiemannR[la,lb,lc,ld] RiemannR[ua,ud,ub,uc]


  ApplyRules[%,RiemannRules]

MathTensor knows all of the standard rules for terms up to the product of two Riemann tensors and its derivatives. It also knows some of the products of three Riemann tensor rules.

Here is a typical relatively small equation that appears in our work.


  <<sigmarules.m


  CD[sigma,la,lb,lc,ld,le,lf]


  ApplyRules[%,SigmaLimitSixRule]

It is important to note that any index out of place or error in sign, etc. will ultimately lead to total nonsense in physical results.

Allow us to define the rules we need.

To determine what rules we need, we must have several things:

It is extremely difficult to determine the size of the minimal set of linearly independent Riemann tensor products. It requires very subtle use of group theoretical arguments like Young tableaux. Consider just one Riemann tensor term's possible indices with no summations:


  indices40 = {la,lb,lc,ld}


  all40 = Permutations[indices40]


  Table[RiemannR[Unlist[all40[[i]]]],{i,1,Length[all40]}]


  Union[% /. {-1 -> 1}]

So we see that the best MathTensor can do using the pre-defined Riemann tensor symmetries is reduce the list to three terms. Any equation we generate will have at most 3 terms. But group theory and well-known identities tell us that the dimension of the minimal set is 2. We need just 1 rule, the cyclic identity to reduce any expression down to 2 terms.

We can also consider situations with summed indices via


  indices41 = {la,lb,lc,uc}


  all41 = Permutations[indices41]


  Table[RiemannR[Unlist[all41[[i]]]],{i,1,Length[all41]}]


  Rest[Union[% /. {-1 -> 1}]]

There is only one object with two indices built out of one Riemann tensor. Now we can consider the case where we have a product of two Riemann tensors (with no derivatives for now) and 8 free indices. We can generate all possible permutations of these 8 indices and then split the list up into pairs of four indices.


  indices80 = {la,lb,lc,ld,le,lf,lg,lh}


  all80 = Permutations[indices80];


  all80part = Map[Partition[#,4]&,all80];


  all80part[[1]]


  Riemann80 = Table[RiemannR[Unlist[all80part[[i,1]]]]*
  RiemannR[Unlist[all80part[[i,2]]]],{i,1,Length[all80part]}];


  Riemann80[[1]]

This generates a lot of terms - 8! - many of which are equal or differ by a sign only.


  Length[Riemann80]


  Riemann80a = Union[Riemann80 /. {-1 -> 1}]


  Length[%]

So, we see that MathTensor will generate at most 315 two Riemann tensor structures with 8 free indices. From group theoretical calculations we find that there are really only 140 terms in the minimal basis. So at first sight we will have to generate 175 rules. But if we look at the terms, we immediately see that terms like RiemannR[la,le,lb,lc], where the second pair of indices is lexically before the second index, can be rewritten in the form RiemannR[la,lb,lc,le] or RiemannR[la,lc,lb,le] using the cylic identity alone. How many terms does this leave?

In order to determine this we must first have a rule that checks each term above to see if it can be rewritten:


  RuleUnique[CyclicRule, RiemannR[la_,lb_,lc_,ld_],
  - RiemannR[la,lc,ld,lb] - RiemannR[la,ld,lb,lc],
  IndicesAndNotOrderedQ[{lb,lc}] && IndicesAndNotOrderedQ[{lb,ld}]]


  Riemann80a /. CyclicRule

For our counting purposes we can remove terms from our list by defining a "fake" rule which simply sets to zero any term that is changed by CyclicRule:


  RuleUnique[FakeCyclicRule,RiemannR[la_,lb_,lc_,ld_],
  0,IndicesAndNotOrderedQ[{lb,lc}] &&
  IndicesAndNotOrderedQ[{lb,ld}]]


  Rest[Union[Riemann80a /. FakeCyclicRule]]


  Length[%]

Aha! We don't need 175 rules, just 1! if we apply CyclicRule.

Now lets look at the case where we sum over the four pairs and generate scalars.


  Union[%% /. {lb->ua,ld->uc,lf->ue,lh->ug}]

We use the MathTensor Canonicalize function to rename summed dummy indices in each term and obtain


  Rest[Union[Map[Canonicalize[#]&,%] /. {-1 -> 1}]]

We see now that when we sum the indices on the minimal set above, we get 4 possible scalar objects with 2 Riemann tensors. But, we know that there can be only 3 independent objects in the minimal scalar set. So, even though the 2 Riemann tensor product set above is a minimal set, the scalar set derived from it is not. It is possible to show that the fourth term above is related to the third via


  RuleUnique[TwoRiemannSummedRule,
  RiemannR[la_,lb_,lc_,ld_] * RiemannR[ua_,ub_,uc_,ud_],
  RiemannR[la,lb,lc,ld] RiemannR[ua,ub,uc,ud]/2,
  PairQ[la,ua] && PairQ[lb,uc] && PairQ[lc,ub] && PairQ[ld,ud]]


  ApplyRules[%%[[4]],TwoRiemannSummedRule]

So using the CyclicRule and TwoRiemannSummedRule we can reduce any scalar expression with 2 Riemann tensor products down to a minimal size.

If we want to go to three Riemann tensors and then try to generate all the permutations we will have 12! = 479,001,600 terms. We will get Out of Memory errors. Even if we could generate that many terms, it is a waste of resources since we know some things about what MathTensor will do. We know that the la and lb indices can only be in two possible configurations due to the symmetries of the RiemannTensor and the lexical ordering of terms that Mathematica does. If we do the permutations on the 10 remaining indices we get 8 x 9! = 2,903,040 terms.

Group theory tells us that there are 46,200 terms in the minimal set! So to illustrate a simpler situtation we can look at the scalars with three Riemann tensors. We need to find a minimal set of 8. It is easy to show that all the terms that MathTensor could generate after the use of Canonicalize and CyclicRule will have the forms


The first terms are obviously just


  terms1 = Map[Canonicalize[#]&,{ScalarR^3, ScalarR * RicciR[la,lb] RicciR[ua,ub],
  ScalarR * RiemannR[la,lb,lc,ld] RiemannR[ua,ub,uc,ud]}]

We might ask whether terms like


  RicciR[la,lb] RicciR[lc,ld] RiemannR[ua,ub,uc,ud]

might not be generated even though it vanishes due to the symmetries of the Ricci and Riemann tensors. Clearly since I can type in this term and it does not automatically vanish, the term can appear. So, we will need to apply the Tsimplify function to our three Riemann tensor scalar expressions to eliminate such terms as with


  Tsimplify[%]

Now we start off with the first Ricci tensor type term:


  indices6 = {lc,uc,ld,ud,le,ue}


  all6 = Permutations[%];


  all6a = Map[Insert[#,ua,1]&,all6];


  all6ab = Map[Insert[#,ub,3]&,%];


  all6ab[[1]]


  all6abpart = Map[Partition[#,4]&,all6ab];


  Riemann6ab = Table[RicciR[la,lb] * 
  RiemannR[Unlist[all6abpart[[i,1]]]]*
  RiemannR[Unlist[all6abpart[[i,2]]]],{i,1,Length[all6abpart]}]


  Rest[Union[Riemann6ab /. {-1 -> 1}]]


  Union[Map[Canonicalize[#]&,%]]


  terms2 = Union[% /. {-1 -> 1}]


  terms = Union[terms1, terms2]

Now, the second Ricci term types:


  all6ab2 = Map[Insert[#,ub,5]&,all6a];


  all6ab2[[1]]


  all6ab2part = Map[Partition[#,4]&,all6ab2];


  Riemann6ab2 = Table[RicciR[la,lb] * 
  RiemannR[Unlist[all6ab2part[[i,1]]]]*
  RiemannR[Unlist[all6ab2part[[i,2]]]],{i,1,Length[all6ab2part]}]


  Rest[Union[Riemann6ab2 /. {-1 -> 1}]]


  Union[Map[Canonicalize[#]&,%]]


  terms3 = Union[% /. {-1 -> 1}]


  terms = Union[terms, terms3]


  indices4 = {le,ue,lf,uf}


  all4 = Permutations[%]


  all4b = Map[Insert[#,ub,1]&,all4];


  all4ab = Map[Insert[#,ua,1]&,all4b];


  all4abc = Map[Insert[#,ud,5]&,all4ab];


  all4abcd = Map[Insert[#,uc,5]&,all4abc];


  all4abcd[[1]]


  all4abcdpart = Map[Partition[#,4]&,all4abcd]


  Riemann4abcd = Table[RiemannR[la,lb,lc,ld] * 
  RiemannR[Unlist[all4abcdpart[[i,1]]]]*
  RiemannR[Unlist[all4abcdpart[[i,2]]]],{i,1,Length[all4abcdpart]}]


  Rest[Union[% /. {-1 -> 1}]]


  Union[Map[Canonicalize[#]&,%]]


  Union[% /. {-1 -> 1}]


  terms = Union[terms,%]


  all4


  all4a = Map[Insert[#,ua,1]&,all4];


  all4ab2 = Map[Insert[#,ub,3]&,all4a];


  all4abc2 = Map[Insert[#,uc,5]&,all4ab2];


  all4abcd2 = Map[Insert[#,ud,7]&,all4abc2];


  all4abcd2[[1]]


  all4abcd2part = Map[Partition[#,4]&,all4abcd2];


  Riemann4abcd2 = Table[RiemannR[la,lb,lc,ld] * 
  RiemannR[Unlist[all4abcd2part[[i,1]]]]*
  RiemannR[Unlist[all4abcd2part[[i,2]]]],{i,1,Length[all4abcd2part]}]


  Union[Map[Canonicalize[#]&,%] /. {-1 -> 1}]


  Map[Tsimplify[#]&,%]


  terms = Union[terms,Rest[%]]


  all4


  all4a = Map[Insert[#,ua,1]&,all4];


  all4ab3 = Map[Insert[#,ub,3]&,all4a];


  all4abc3 = Map[Insert[#,uc,5]&,all4ab3];


  all4abcd3 = Map[Insert[#,ud,7]&,all4abc3];


  all4abcd3[[1]]


  all4abcd3part = Map[Partition[#,4]&,all4abcd3];


  Riemann4abcd3 = Table[RiemannR[la,lb,lc,ld] * 
  RiemannR[Unlist[all4abcd3part[[i,1]]]]*
  RiemannR[Unlist[all4abcd3part[[i,2]]]],{i,1,Length[all4abcd3part]}]


  Union[Map[Canonicalize[#]&,Riemann4abcd3] /. {-1->1}]


  Union[Map[Tsimplify[#]&,%]]


  terms = Union[terms,Rest[%]]


  Length[%]

So using the symmetries of the Riemann tensor, its contractions to the Ricci tensor and Riemann scalar, and the Canonicalize and Tsimplify functions we can reduce any scalar expression into the 9 terms above. This is one more than we need to have a minimal set.

If we now apply CyclicRule,


  terms /. CyclicRule


  Map[Canonicalize[#]&,%]


  Union[% /. {-1 -> 1,-2 -> 1}]


  Length[%]

So, with the use of CyclicRule and Canonicalize, we finally get a list of 8 independent terms and a minimal set we can use. We also learn that we only need the CyclicRule added to reduce any expression to 8 terms.

Simplify the result down to the smallest equation possible.

Lets go back and look at the non-derivative Riemann tensor terms in the long six index sigma expression we showed earlier. We will drop the derivative terms


  Drop[%16,12]


  Length[%]


  %% /. CyclicRule


  Expand[%]
  


  Length[%]


  Tsimplify[%%]


  Length[%]


  RuleUnique[afixrule,RiemannR[b_,c_,d_,e_] *
  RiemannR[f_,g_,h_,i_], RiemannR[Raise[b],c,d,e]*
  RiemannR[f,g,h,Lower[i]],
  PairQ[b,i] && !SameQ[la,f]]


  %%% /. afixrule


  Length[%]


  %% /. CyclicRule


  Expand[%]


  Length[%]

So we can reduce our original expression to a minimal set with the application of just a few simple rules.

Go up to Steven M. Christensen and Associates, Inc. Home Page