The lemma is a statement of the following nature.

*Lemma: For words a, b, A and B in the fundamental group of a surface satisfying some conditions, if ab=AB then a=A and b=B.*

Here are the steps in my discovery of a proof:

- The starting point was a result of Chas-Krongold, which was a special case in a more restricted context, plus knowing from Moira Chas that she had proved (but not published) a result similar to my lemma but in a more restricted context.
- The Chas-Krongold work used what is called
*small cancellation theory.*It is well known that a method that proves the same results in more generality (at the cost of weaker estimates) is $\latex\delta$*-hyperbolicity*, which is applicable in our context. - To use word-hyperbolicity, I tried to construct
*quasi-geodesics*associated to ab and AB. This is what we can try to associate to elements. - The construction of the quasi-geodesics was the most obvious one. The issue was to prove it was a quasi-geodesic. Here I used a non-trivial theorem –
*local quasi-geodesics are quasi-geodesics.* - To verify that what we obtained was a local quasi-geodesic, one used some geometry and another lemma – a lower bound on
*angles.* - To get the lower bound on angles involved using another known trick from the literature –
*consider the commutator and show that it is small.* - Now that we have quasi-geodesics, we use one of the main theorems concerning $\latex\
*delta$-*hyperbolicity, that quasi-geodesics are close to geodesics. - Now we use some geometry parallel to the combinatorics of Chas-Krongold and the trick of considering commutators to finish the proof.

The deductions are not difficult at any point. The main obstacle to automation (say of a computer as a collaborator) is to be able to make the right analogies and dig up the literature. Indeed the analogy is also not that obscure – early papers on $\latex\delta$-hyperbolicity do show that the results include those of small-cancellation theory.

The main barrier is understanding natural language. In the sequel I speculate on a scheme to try this.

]]>*Theorem (Lohkamp): Let be a Riemannian manifold whose dimension is at least three. Then there is a Riemannian metric on , which can be taken to be arbitrarily close to in the metric, so that has negative Ricci curvature.*

Three questions come to mind:

- Why does this need dimension at least three?
- The corresponding result for positive Ricci curvature is false. Why the asymmetry?
- For what other curvature conditions can we prove such a result?

The third question amounts to asking what curvature properties are natural properties of the underlying metric space, or the underlying metric measure space. Both positive and negative sectional curvatures, more generally lower and upper bounds on sectional curvatures, can be expressed in terms of *comparison triangles.* This makes them metric properties. More recently it has been shown that *lower bounds* on Ricci curvature can be expressed in terms of so called *optimal transport* on metric measure spaces.

On the other hand, Lohkamp’s theorem shows that negative Ricci curvature is not an interesting natural property of metric measure spaces. More precisely, the set of Riemannian metrics with negative Ricci curvature is *dense* in the space of all Riemannian metrics – in particular there are no topological restrictions. In general, we would like to know which properties are dense. In particular there are no topological restrictions.

One can, to some extent, answer the first two questions relatively easily, but the answers are illuminating. I return to these in sequels. For now, I will just end with a moral from my previous failure.

*Moral: We cannot avoid hard analysis. *My previous attempts all amounted to attempting to reduce the problem to understanding the *dominant term.* However one (presumably) cannot avoid the stubborn persistence of a pair of comparable terms of opposite signs, one only slightly larger than the other. In this case, and many others that also are hard to intuitively understand, qualitative properties are determined by quantitative behaviour of the underlying system.

Gromov observed that in place of the logarithm of the cardinality, one can use any other quantity that is additive (and monotonic). Indeed there is another familiar additive (and monotonic) quantity: *dimension. *By using dimension in this way, Gromov introduced the idea of *mean dimension* of the space of holomorphic maps, which is analogous to the density of entropy.

Here I will comment on building on Gromov’s idea in a couple of different ways. The starting point in both cases is that there is an obvious connection to Sheafs.

*Sheafs and Subadditivity: *Suppose we have a *sheaf* on a topological space *. *This means that we associate to each open set $U\subset latex X$ an object – here a set , perhaps with the structure of a vector space. If , there is a corresponding restriction map from to . Further, given elements and $s_V\in latex S(V)$ whose restrictions to agree, there is a unique element in whose restrictions to and are and . The obvious examples of sheafs are continuous (smooth, holomorphic) functions on . Gromov considers sheafs of holomorphic functions with appropirate bounds to ensure finite-dimensionality.

Suppose we are given a sheaf and an appropriate notion of entropy for the sets with the given structure – for example, logarithm of the cardinality or dimension of vector spaces. We can then associate to open sets the function . Additivity and monotonicity then give:

.

*Independence and Conditional Entropy: *If , then we see that , so the sets (or rather the systems with these sets as domains) are independent. We can take this to be the definition of independence. More generally, we can define the conditional entropy:

This satisfies , with equality in the second inequality characterising independence.

*Domains of holomorphy:*

We can consider the sheaf of holomorphic functions satisfying some appropriate conditions ensuring finite dimensionality: $latex\frac{\partial f}{\partial z_i}\leq Cf$ seem promising. We can then consider the function . We get associated conditional entropies.

If is contained in the holomorphic convex hull of , then . Thus, we get a nice refinement of notions like domains of holomorphy.

*Graphs and Colourings:*

Given a countable graph, we consider its set of vertices as a discrete set. We can associate to a set of vertices the set of colourings of the vertices in using red, blue and green, so that adjacent vertices have different colours. This gives a sheaf, and the entropy may give insights into 3-colourings.

]]>In a complex world, with a favourable outcome depending on many factors (including chance), learning only from ultimate successes and failures is limited. It is better to reward actions that lead to success – solving puzzles, hard work, learning, etc. It is even better to develop ones *judgement* and *taste*, i.e., to know what to reward by how much.

While computers today have many wonderful reasoning powers – being the best chess players and mastermind quizzers, they seem to lack the taste and judgement that will lead to deeper discovery. But one can argue that this is because we forgot to put these qualities in them. While automated reasoning functions do have *reward functions*, and even compete for survival, computers are not set up to decide, or even fine tune, what deserves a reward. A good way to introduce such taste may be to mimic the interplay between our own cognitive and emotional systems, and their relation to the world outside. Many computers may simply become tetris addicts, but perhaps a few deep thinkers will emerge.