LSE - Small Logo
LSE - Small Logo

Andrew Lewis-Pye

May 30th, 2018

5 Minutes with George Bampalias: algorithmic randomness and learning theory

0 comments

Estimated reading time: 10 minutes

Andrew Lewis-Pye

May 30th, 2018

5 Minutes with George Bampalias: algorithmic randomness and learning theory

0 comments

Estimated reading time: 10 minutes

George Bampalias works at the Institute of Software, Chinese Academy of Sciences, and visited the mathematics department in January 2018. After George gave a very interesting talk on learning theory, he was interviewed by our own Andy Lewis-Pye

 

You’ve worked a lot in the area of “algorithmic randomness”. Could you sum up for the layman, what it’s all about?

Algorithmic randomness is about the existence (or not) of structure or patterns in individual mathematical objects (e.g. a file in your computer, viewed as a string of letters). It is called “algorithmic” because it is with respect to algorithms, i.e. processes that can be executed by computers. Every object is non-random with respect to itself, since it is quite structured and predictable if we already have total access to the object. Hence randomness makes sense with respect to an origin, a fixed point of view, a coordinate system.

Algorithmic information theory allows to study, qualify and quantify the randomness or the predictability of *individual* objects. In contrast, classical information theory  (based on probability) studies the predictability of probabilistic sources, i.e. ensembles of outcomes that occur according to a certain probability distribution.

For example, a sequence of a million 0s (say, as the outcome of coin tosses) is as probable as a million bits from the output of a nuclear decay radiation source (quantum mechanics predicts that such sources are trully unpredictable).  However algorithmic randomness regards the sequence of 0s as highly non-random, as opposed to the seemingly patternless stream of quantum noise. The difference is that a sequence of a million 0s can be produced as the output of a relatively short computer program (or description, in a standard language) while quantum noise does not have such short descriptions.

To sum up, the idea behind algorithmic randomness is that something is random if it does not have short descriptions (in standard languages). This simple idea leads quite fast to the multiple facets of randomness such as incompressibility (a random object is incompressible) and unpredictability.

… and in particular the talk you gave here concerned algorithmic randomness and learning theory. Could you say something about the wider connections between these areas?

Learning theory traditionally aims at understanding the way concepts can be learned by animals (including humans) or machines. In algorithmic learning theory a learner is a machine which typically receives texts from a (formal) language and produces guesses for a grammar that generates the given language, with the expectation that *in the limit* it will eventually settle on a correct guess. Some classes of languages are learnable by machines in this way, while others are not.

Recently it was suggested that one can use this theory in order to approach a fundamental problem in statistics, namely the identification of probability distributions from random data. These days there is an abundance of data collected; probabilistic models of the data can be used to make useful predictions for future outcomes. This basic problem has been approached from all sorts of angles. The new angle that has recently been introduced is to use the methodology of “identification in the limit” from algorithmic learning theory, and require that a learner (machine) eventually produces a description of a probability distribution with respect to which the stream of data that it is reading, is (algorithmically) random. One can see this approach as a combination of traditional algorithmic learning theory with algorithmic information theory. What we recently showed (with Fang Nan from Heidelberg and Frank Stephan from Singapore) is that there are many parallels between this theory of learning from random data and the classic theory of learning from structured text. We constructed tools which allow the direct use of the existing classic theory for the development of this new probabilistic learning theory.

You are based in Beijing, which must be interesting for a westerner? Presumably academia is run quite differently in some ways there?

Working in the Chinese Academy of Sciences in Beijing is great. The research environment is very motivating and there is a lot of potential for research interactions within the institutions here. It is an extremely dynamic environment and I find this both exciting and challenging.

While it is true that some westerners find it a slightly hard to feel at home here, this really depends on one’s personality. Having worked in many places in the past (the U.K., Europe, New Zealand, the U.S.) I think that the differences in the universities are perhaps not as big as most people think. I would say that the defining difference with most western institutions is the frequency of changes (in policies, regulations, employment etc.) which is characteristic in dynamic developing environments. This can mean some lack of security and predictability on the one hand, but also a stable stream of new unexpected opportunities on the other.

The material you work in is often at the interface between mathematics, computer science, and other areas. Could you say something about the differences between the academic communities in these areas?

Many of the stereotypes regarding the mathematics and the computer science community are true to some degree. In mathematics researchers traditionally published less, and did not follow strict deadlines for the completion of their work.

Mathematics conferences are usually rather informal and publication in the proceedings are usually not as high profile as premium journal publications. On the other hand in most areas in computer science there are competitive high-profile conferences which drive much of the research activity, with strict deadlines and rather formal structure. I found that many computer science researchers start a project with a specific conference deadline in mind, rather than a wider goal which is independent of publication prospects. This culture can be both good and bad. The bad side is that the publication volume is much higher in the computer science community and most agree that overall the quality is lowered by the many submissions that are not ready for public dissemination.

The good side is that deadlines drive research, which seems much more streamlined and structured in computer science than in mathematics.

I personally like and try to assimilate both cultures and I don’t regard myself on one side or the other. I would also go as far as to say that the distinction is rather superficial, and in the end what matters is the quality of research, and the impact that it has (sooner or later).

What advice do you have for young mathematicians who are just starting out? Is there anything you would do differently second time around?

I’m reluctant to give advice as such, but here are some thoughts.

Working on a PhD requires a lot of focus on a specific area, even a specific problem, and such an exercise seems necessary in order to gain depth and expertise on a topic. After this stage it is a good idea for a young mathematician to branch out to different topics or even areas, learning new things and using the PhD experience  as a guide. This is perhaps not as easy as continuing working in one’s PhD area, but in the long run it pays off; branching out is also much more interesting. Having said this, such choices also depend on the employment opportunities that one has. The general message here is to keep an open mind for new ideas and concepts that might appear foreign at first, but often are deeply connected to things we already know.

About the author

Andrew Lewis-Pye

Posted In: 5 Minutes with | General

Leave a Reply

Your email address will not be published. Required fields are marked *