Algorithmic randomness is about the existence (or not) of structure or patterns in individual mathematical objects (e.g. a file in your computer, viewed as a string of letters). It is called “algorithmic” because it is with respect to algorithms, i.e. processes that can be executed by computers. Every object is non-random with respect to itself, since it is quite structured and predictable if we already have total access to the object. Hence randomness makes sense with respect to an origin, a fixed point of view, a coordinate system.

Algorithmic information theory allows us to study, qualify and quantify the randomness or the predictability of *individual* objects. In contrast, classical information theory (based on probability) studies the predictability of probabilistic sources, i.e. ensembles of outcomes that occur according to a certain probability distribution.

For example, a sequence of a million 0s (say, as the outcome of coin tosses) is as probable as a million bits from the output of a nuclear decay radiation source (quantum mechanics predicts that such sources are truly unpredictable). However, algorithmic randomness regards the sequence of 0s as highly non-random, as opposed to the seemingly patternless stream of quantum noise. The difference is that a sequence of a million 0s can be produced as the output of a relatively short computer program (or description, in a standard language) while quantum noise does not have such short descriptions.

To sum up, the idea behind algorithmic randomness is that something is random if it does not have short descriptions (in standard languages). This simple idea leads quite fast to the multiple facets of randomness such as incompressibility (a random object is incompressible) and unpredictability.
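As a rough illustration of the incompressibility idea, here is a minimal Python sketch. It rests on two assumptions that are not part of the discussion above: a general-purpose compressor (`zlib`) stands in for "shortest description", which only ever gives an upper bound on description length, and the operating system's random bytes (`os.urandom`) stand in for quantum noise.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form of `data` (maximum effort)."""
    return len(zlib.compress(data, 9))


N_BYTES = 125_000  # one million bits

zeros = bytes(N_BYTES)       # a million 0 bits
noise = os.urandom(N_BYTES)  # stand-in for a patternless source

# The all-zero string shrinks to a tiny fraction of its length,
# while the random bytes do not shrink at all.
print("zeros:", compressed_size(zeros), "bytes")
print("noise:", compressed_size(noise), "bytes")
```

A short compressed form proves that a short description exists. A long one proves nothing by itself, since a cleverer description might still be found, which is one reason the theory is stated in terms of all programs rather than any particular compressor.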

Learning theory traditionally aims at understanding the way concepts can be learned by animals (including humans) or machines. In algorithmic learning theory a learner is a machine which typically receives texts from a (formal) language and produces guesses for a grammar that generates the given language, with the expectation that *in the limit* it will eventually settle on a correct guess. Some classes of languages are learnable by machines in this way, while others are not.

Recently it was suggested that one can use this theory in order to approach a fundamental problem in statistics, namely the identification of probability distributions from random data. These days there is an abundance of data collected; probabilistic models of the data can be used to make useful predictions for future outcomes. This basic problem has been approached from all sorts of angles. The new angle that has recently been introduced is to use the methodology of “identification in the limit” from algorithmic learning theory, and require that a learner (machine) eventually produces a description of a probability distribution with respect to which the stream of data that it is reading is (algorithmically) random. One can see this approach as a combination of traditional algorithmic learning theory with algorithmic information theory. What we recently showed (with Fang Nan from Heidelberg and Frank Stephan from Singapore) is that there are many parallels between this theory of learning from random data and the classic theory of learning from structured text. We constructed tools which allow the direct use of the existing classic theory for the development of this new probabilistic learning theory.

Working in the Chinese Academy of Sciences in Beijing is great. The research environment is very motivating and there is a lot of potential for research interactions within the institutions here. It is an extremely dynamic environment and I find this both exciting and challenging.

While it is true that some westerners find it slightly hard to feel at home here, this really depends on one’s personality. Having worked in many places in the past (the U.K., Europe, New Zealand, the U.S.) I think that the differences in the universities are perhaps not as big as most people think. I would say that the defining difference with most western institutions is the frequency of changes (in policies, regulations, employment etc.) which is characteristic in dynamic developing environments. This can mean some lack of security and predictability on the one hand, but also a stable stream of new unexpected opportunities on the other.

Many of the stereotypes regarding the mathematics and the computer science community are true to some degree. In mathematics researchers traditionally published less, and did not follow strict deadlines for the completion of their work.

Mathematics conferences are usually rather informal and publications in the proceedings are usually not as high profile as premium journal publications. On the other hand, in most areas in computer science there are competitive high-profile conferences which drive much of the research activity, with strict deadlines and rather formal structure. I found that many computer science researchers start a project with a specific conference deadline in mind, rather than a wider goal which is independent of publication prospects. This culture can be both good and bad. The bad side is that the publication volume is much higher in the computer science community and most agree that overall the quality is lowered by the many submissions that are not ready for public dissemination.

The good side is that deadlines drive research, which seems much more streamlined and structured in computer science than in mathematics.

I personally like and try to assimilate both cultures and I don’t regard myself on one side or the other. I would also go as far as to say that the distinction is rather superficial, and in the end what matters is the quality of research, and the impact that it has (sooner or later).

I’m reluctant to give advice as such, but here are some thoughts.

Working on a PhD requires a lot of focus on a specific area, even a specific problem, and such an exercise seems necessary in order to gain depth and expertise on a topic. After this stage it is a good idea for a young mathematician to branch out to different topics or even areas, learning new things and using the PhD experience as a guide. This is perhaps not as easy as continuing working in one’s PhD area, but in the long run it pays off; branching out is also much more interesting. Having said this, such choices also depend on the employment opportunities that one has. The general message here is to keep an open mind for new ideas and concepts that might appear foreign at first, but often are deeply connected to things we already know.

**What are you currently researching?**
I am mostly interested in discrete geometry, which includes combinatorial questions about geometric objects. Sometimes I also think about purely combinatorial questions; for example, I am currently working on a question about partitioning edge-coloured hypergraphs into monochromatic cycles.

**Why did you choose this area of study?**
I like combinatorics and geometry, and this area is a mixture of these two.

**What do you hope to do career-wise, long term?**
I would like to stay in academia and do research.

**Can you provide any advice to prospective students about the most effective way to approach research and keep stress levels down?**
Of course, this varies from individual to individual and from area to area, but there are some things that can be useful in general. Set realistic expectations: you should not anticipate finding results quickly. It is a slow procedure. For me the recipe is to try to be happy even with small results: don’t let failure disappoint you too greatly. I think it is also good to not separate weekdays and weekends too much; when you have ideas and feel motivated, don’t stop for the weekend, but treat yourself to much-needed rest days later.

**What resources are available at LSE to help young researchers?**
There are several funds at both School and Departmental level. Mathematicians need whiteboards – we’re lucky to have many in our PhD office, plus all the basic provisions we could ever need (stationery, printing, equipment, etc.). Our PhD Office itself is a really good, productive environment to work in, where we can focus solidly on our research but also collaborate and share thoughts. The Department as a whole, alongside the PhD Academy and our Research Manager, assists with the essential practicalities of PhD study. The Department invites key visitors to present at our seminar series. Crucially, we have a fantastic coffee machine in the Department.

**In a few words, what is the best thing about studying at LSE?**
Everyone is very nice; I am a valued member of the Department.

Cathy visited the Department of Mathematics, LSE in July 2017 to present a Public Lecture entitled “Weapons of Math Destruction: how big data increases inequality and threatens democracy”, related to her book of the same name. Whilst visiting, she took the time to talk with Martin Anthony and Andy Lewis-Pye (LSE Maths) about how she came to be an author, her latest book, how to define these ‘weapons’ and what the future holds.

A podcast featuring highlights from Cathy’s Public Lecture is available here. The introduction is provided by Martin Anthony (Head of the Department of Mathematics). The recording ends with a great selection of Q&As from a very enthusiastic, engaged audience.

I’ve just been at a conference called “Kinks 2”, or that might be “knkx 2” (I never saw it written down) *(NB from blog editor: it was KnKx2)*. But look at the list of topics: dynamics, functional equations, infinite combinatorics and probability. How could I resist that?

Of course, with a conference like this, the first question is: is the conference on the union of these topics, or their intersection? The conference logo strongly suggests the latter. But as usual I was ready to plunge in with something connecting with more than one of the topics. I have thought about functional equations, but only in the power series ring (or the ring of species); but it seemed to me that sum-free sets would be the best fit: there is infinite combinatorics and probability; there is no dynamics, but some of the behaviour looks as if there should be!

The organisers were Adam Ostaszewski and Nick Bingham. So the conference was centred on their common interests. I will say a bit about this in due course. Unfortunately, for personal reasons, I was not able to travel south until Monday morning, and even getting up at 6am I didn’t get to the conference room until a little after 2pm. So I missed the opening talk by Nick Bingham (which no doubt set the scene, and according to Charles Goldie was a sermon) and the talk by Jaroslav Smítal; I arrived shortly after the first afternoon lecture, by Vladimir Bogachev, had begun.

I will try to give an overview of what I heard, but will not cover everything at the same level. Of course, a topologist, analyst or probabilist would give a very different account!

Vladimir’s talk was something which tied in to the general theme in a way I didn’t realise until later. He talked about topological vector spaces carrying measures; in such a space, consider the set of vectors along which the measure is differentiable (in one of several senses), one of which is apparently called the Cameron–Martin space. He had something called *D*_{C} and explained that he originally named it after a Russian mathematician whose name began with S (in transliterated form); he had used the Cyrillic form, never imagining in those days that one day he would be lecturing about it in London!

David Applebaum talked about extending the notion of a Feller–Markov process from compact topological groups to symmetric spaces, and considering five questions that have been studied in the classical case, asking how they extend. He wrote on the board, which didn’t actually slow him down much. David said “I want to prove something”, but Nick, who was chairing, said “No, just tell the story”.

Finally for the day, Christopher Good explained two concepts to us: *shifts of finite type* (this means closed subspaces of the space of sequences over a finite alphabet, closed under the shift map, and defined by finitely many excluded subwords), and *shadowing* (this is relevant if you are iterating some function using the computer; typically the computed value of *f*(*x*_{i}) will not be exact, but will be a point

I was the first speaker next morning. I arrived half an hour early, to find the coffee laid out but nobody else around. Soon Rebecca Lumb came along and logged in to the computer, so I could load my talk. I found that the clicker provided had the feature that the left button advances the slides, so I took it out and put in my own, which works the right way round. The talk went well, and I enjoyed a gasp of surprise from the audience when I displayed my empirical approximation to the density spectrum of a random sum-free set. My last slide, a picture of an apparently non-periodic sum-free set generated by a periodic input, was also admired. It was suggested that a series of such pictures, in striking colours, would be worthy of an art exhibition. The slides are available here.

After a coffee break, Imre Leader spoke about sumsets (so not too far from my talk but not at all the same). As usual, he wrote on the board. The question was: if you colour the natural numbers with finitely many colours, is there an infinite set for which all pairwise sums have the same colour? The answer is yes if you take pairwise sums of distinct elements. (Colour the pair {*i,j*} with the colour of *i*+*j*. By Ramsey’s theorem, there is an infinite set with all pairs of the same colour; this does it!). The first surprise was that if you allow *x*+*x* as a sum as well, then it is impossible; he showed us the nice two-step argument for this. (The first step is simple; you can always colour so that *x* and 2*x* have different colours: take two colours, and give *x* the colour red if the exponent of 2 in the factorisation of *x* is even, blue if it is odd. The second step is a bit more elaborate.) What about larger systems? He showed us why the answer is No in the integers, and No in the rationals, but (surprisingly) Yes in the reals, if you assume the Continuum Hypothesis (and indeed, Yes in a vector space of sufficiently large dimension over the rationals (precisely, beth_{ω})).
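The first step of that argument is easy to check by machine. The sketch below is only an illustration of the red/blue colouring described above; the function names are my own, and it says nothing about the more elaborate second step.

```python
def two_adic_valuation(x: int) -> int:
    """Exponent of 2 in the prime factorisation of the positive integer x."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    exponent = 0
    while x % 2 == 0:
        x //= 2
        exponent += 1
    return exponent


def colour(x: int) -> str:
    """Red if the exponent of 2 in x is even, blue if it is odd."""
    return "red" if two_adic_valuation(x) % 2 == 0 else "blue"


# Doubling raises the exponent of 2 by exactly one, so the parity flips
# and x and 2x always receive different colours.
assert all(colour(x) != colour(2 * x) for x in range(1, 10_001))
```

Since *x*+*x* = 2*x*, no set containing some *x* can have *x* itself and the sum *x*+*x* in the same colour class under this colouring.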

First after lunch was Dugald Macpherson, who talked about automorphism groups of countable relational structures, especially closed oligomorphic groups (and more specially, automorphism groups of homogeneous structures over finite relational languages). His talk was in three parts, of which he skipped the second for lack of time. The first part was stuff I knew well, the connection between Ramsey structures and topological dynamics (the Kechris–Pestov–Todorcevic theorem and what happened after). The second part would have been about simplicity. The third concerned the existence of “ample generics” and applications of this to many things including the small index property, uncountable cofinality, and the Bergman property, and then sufficient conditions for ample generics (with names like EPPA and APPA).

Eliza Jabłońska talked about “Haar meager sets”, an ideal of subsets of a topological group having many similarities to the Haar null sets in the case of a locally compact group. Some, but not all, of the classic results about measure and category for the real numbers go through to this situation.

Finally, Janusz Brzdęk talked about the generalised Steinhaus property. In loose terms, this says that, if *A* is a subset of a topological group which is “large enough and decent”, then *A*−*A* has non-empty interior, and more generally, if *A* is as above and *B* is “not small”, then *A*−*B* has non-empty interior. This kind of result has applications in the theory of functional equations, for example showing that if you have a solution of *f*(*x*+*y*) = *f*(*x*)+*f*(*y*) in a large enough and decent subset of G then this solution can be extended to the whole of *G*. There are also applications to “automatic continuity” (but I don’t know what this is). He started off with some very general results which apply in any magma (set with a binary operation) with identity. You have to redefine *A*−*B* in such a case, since inverses don’t exist; it is the set of *z* for which *z*+*B* intersects *A*. He went on to a discussion of microperiodic functions (which have arbitrarily small periods): on the reals, such a function, if continuous, must be constant, and if Lebesgue measurable, must be constant almost everywhere. There are also approximately microperiodic functions, sub-microperiodic functions, and so on.

Then off to a pleasant conference dinner at the local Thai restaurant, where conversation ranged over doing, teaching and understanding mathematics, along with many other topics.

Wednesday was a beautiful day, so I walked in to the LSE, past St Paul’s and down Fleet Street.

The first speaker, Harry Miller, was talking to us by Skype from Sarajevo, since his health is not good enough to allow him to make the trip. The technology worked reasonably well, though the sound quality was not great and the handwritten slides were packed with information. He gave us half a dozen different conditions saying that a subset of the unit interval is “large” (not including the well-known ones, measure 1 and residual), and a number of implications between them and applications. One nice observation: are there compact sets *A* and *B* such that *A*+*B* contains an interval but *A*−*B* doesn’t? Such sets had been constructed by “transfinite mumbo-jumbo”, but he showed us a simple direct construction: *A* is the set of numbers in [0,1] whose base-7 “decimal” expansion has only digits 0, 4 and 6, while *B* is the set using digits 0, 2 and 6.
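The digit arithmetic behind this construction can be explored with a small Python sketch. This is my own reading of why the example works, not necessarily the argument given in the talk, and the helper `truncations` is a hypothetical name introduced for illustration.

```python
from itertools import product

A_DIGITS = {0, 4, 6}
B_DIGITS = {0, 2, 6}

# Digit-by-digit sums and differences.
digit_sums = {a + b for a in A_DIGITS for b in B_DIGITS}
digit_diffs = {a - b for a in A_DIGITS for b in B_DIGITS}
print(sorted(digit_sums))   # every even number from 0 to 12
print(sorted(digit_diffs))  # only six values: -4 is missing


def truncations(digits, depth):
    """Integer numerators (over 7**depth) of all depth-digit base-7 expansions."""
    values = set()
    for word in product(sorted(digits), repeat=depth):
        value = 0
        for d in word:
            value = 7 * value + d
        values.add(value)
    return values


DEPTH = 3
a_vals = truncations(A_DIGITS, DEPTH)
b_vals = truncations(B_DIGITS, DEPTH)
sums = {a + b for a in a_vals for b in b_vals}
diffs = {a - b for a in a_vals for b in b_vals}

# Sums hit every even numerator, i.e. twice every depth-3 point of [0,1).
assert sums == set(range(0, 2 * 7**DEPTH, 2))
# Differences come from only 6 digit choices per place.
assert len(diffs) == 6**DEPTH
```

The digit sums are exactly twice the digits 0 to 6, so every number in [0,2] is twice a base-7 expansion and hence lies in *A*+*B*. The differences use only six digit values per place, so *A*−*B* is covered by six copies of itself scaled by 1/7; a bounded set with that property has Lebesgue measure zero and so cannot contain an interval.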

After this, Marta Štefánková talked about hyperspace: this is the space *K*(*X*) of all compact subsets of a compact metric space *X*, with the Hausdorff metric. Given a map *f*, how do the dynamics of *f* on *X* relate to the dynamics of the induced map on *K*(*X*)? She introduced four kinds of “Li–Yorke chaos”: a map can be generically, or densely, chaotic or epsilon-chaotic. There are a number of implications between the resulting eight situations, but some non-implications as well; almost everything is known if *X* is the unit interval but in other cases there are still many mysteries.

Adrian Mathias, whom I haven’t seen for donkey’s years, talked about connections between descriptive set theory and dynamics (he said, inspired by a trip to Barcelona, where he found the dynamicists sadly lacking in knowledge about descriptive set theory). The subtitle was “analytic sets under attack”. (The basic idea is that iteration of a map *f* is continued into the transfinite (I missed the explanation of how this is done), and *x* attacks *y* if there is an increasing sequence of ordinals so that the corresponding sequence of iterations of *f* applied to *x* has limit *y*.)

Dona Strauss talked about the space β**N**, the Stone–Čech compactification of the natural numbers (realised as the set of ultrafilters on the natural numbers). It inherits a semigroup structure from **N**, and the interplay of algebra and topology is very interesting. Her main result was that many subsets of β**N** which are very natural from an algebraic point of view are not Borel sets: these include the set of idempotents, the minimal ideal, any principal right ideal, and so on. (The Borel sets form a hierarchy, but any beginners’ text on descriptive set theory tells you not to worry too much, all interesting sets are Borel, and are in fact very low in the hierarchy.)

Peter Allen took time out from looking after his five-week-old baby to come and tell us about graph limits. It was a remarkable talk; I have heard several presentations of that theory, including one by Dan Kral in Durham, but Peter managed to throw new light on it by taking things in a different order. He also talked about some applications, such as a version of the theory for sparse graphs, and some major results already found using this approach: these include

- a considerable simplification of the Green–Tao theorem that the primes contain arbitrarily long arithmetic progressions; and
- a solution of an old problem: given two bounded sets A, B in Euclidean space of dimension at least 3, having the same measure, each of which “covers” the other (this means finitely many congruent copies of A cover B and vice versa), there is a finite partition of A into measurable pieces, and a collection of isometries (one for each piece) so that the images of the pieces under the isometries partition B.

Finally it was time for Adam’s talk. His title was “Asymptotic group actions and their limits”, but he had the brilliant subtitle for a talk at the London School of Economics and Political Science: “On quantifier easing”. He explained how the notion of regular variation had led him to various functional equations, the simplest of which is additivity, and then he had discovered that these equations were actually the same up to a twist. There was quite a lot of technical stuff, and my concentration was beginning to fade, so I didn’t catch all the details.

That was the end of the conference, and we all went our separate ways.

*Additional course commentary is generously provided by Michael Seal (London School of Economics, Department of Mathematics, BSc Mathematics with Economics, Class of 2016).*

Although this is a personal account, the success of the course is the result of the efforts of many people, and I must begin by thanking all of them for their contributions.

By about 2010 the fledgling Mathematics Department at the LSE had grown to the point where it could be compared with long-established departments in other UK universities. In particular, the new degree in Mathematics with Economics provided the opportunity to give our students a broad background in mathematics. I had dabbled in the History of Mathematics for many years and, when I mentioned the possibility of a course in that subject, I was encouraged by the then Head of Department, Jan van den Heuvel. One of my qualifications was that I had some experience of the famous OU (Open University) course, having served as an External Examiner for several years.

So there emerged a plan to set up a course of about twenty lectures and ten classes, covering the main events in the history of mathematics from the dawn of civilisation to the present day. In order to give the course a distinctive flavour, the applications of mathematics in economics and finance would be stressed. Also, the course would reflect the recent trend towards a broad view of mathematics in all its forms, rather than the traditional approach based on ‘famous mathematicians and their theorems’. The teaching would be based on the OU model, using mainly ‘gobbets’ taken from historical sources. In the coursework and the examination, students would be expected to comment on content, context, and historical significance of these gobbets. There would also be an assessed coursework essay, counting for 30% of the final mark. It was hoped that the students would thereby acquire the skill of writing about technical mathematics in narrative form. For more information on the course, current students can visit this Moodle page.

At an early stage I was lucky enough to involve Robin Wilson, an old friend and co-author, who also had experience of the OU system. Like me, he was officially ‘retired’, and was able to bring much-needed expertise and enthusiasm to the project. After some preparatory work we were able to satisfy the LSE’s regulations, and the course MA318 History of Mathematics in Finance and Economics was offered for the first time in the academic year 2012-13. There were six students, and contributions from four teachers. The course materials were produced in a rather crude way but, by and large, the course went well. Nevertheless, there were obvious lessons to be learned, and in order to make the course more coherent, all the lectures and the classes in 2013-14 were done by Robin and myself. Another innovation was to encourage active involvement of the students in the classes, which we did by using quizzes. These comprised short questions in which students were asked to answer simple problems using the techniques available in the relevant historical period.

The number of students taking the course increased steadily. We were attracting students from several degree programmes, including Actuarial Science, Business Mathematics and Statistics, and the General Course (the LSE name for overseas students who spend one year here), as well as Mathematics with Economics. This meant that we needed some additional help with the teaching, and we were lucky enough to recruit June Barrow-Green, one of the UK’s leading historians of mathematics. It was also time to make the course materials more attractive, by collecting them in five illustrated booklets which were distributed to registered students in advance. The covers of two of these booklets are shown below. Some additional materials are made available on Moodle.

We were aware that the British Society for the History of Mathematics offered annual prizes for essays written by undergraduates at a UK university, and in 2015-16 we encouraged students to submit their MA318 essays for this prize. It was gratifying that one prize was awarded to Michael Seal, one of our Mathematics with Economics students, for his essay entitled ‘Was there a revolution in analysis in the nineteenth century’. Clearly the course is in good shape, and we are looking forward to teaching it again next year.

MA318 offers mathematics students the rare opportunity to write a marked piece of work. Last year, one student submitted their essay to The British Society for the History of Mathematics (BSHM) Undergraduate Essay Prize, awarded annually for an essay by an undergraduate student on any topic in the history of mathematics. One of 2016’s two prizes was awarded to Michael Seal (London School of Economics, Department of Mathematics, BSc Mathematics with Economics) for his essay entitled “Was there a Revolution in Analysis in the Early 19th Century?”

We spoke to Michael about his success and he had this to say about his time at LSE and how it now influences his role as a secondary school maths teacher:

“The first thing I should say is what a pleasant surprise it was to be awarded the prize! It is lovely to have my work recognised, and exciting to be able to engage further with those at the BSHM – I think the essay prize is a great opportunity, and I’m very grateful to the BSHM for presenting me with it!

The essay itself asked whether there was a revolution in Analysis in the early 19th century, which is interesting because of the breakthroughs in Calculus that preceded the period, and the ensuing interplay between the march of mathematics and science, and the political and religious institutions of the time. My argument was 3-pronged and concluded that there was a revolution in Analysis at that time. I concluded that it had 3 defining characteristics: the paradigm shift in the collective attitude towards rigour at the beginning of the 19th century, which motivated a huge increase in the organisation of, and research into, the concepts of Analysis, which in turn has had an unprecedented impact on our current understanding of the subject.

Finally I wanted to say how much I enjoyed the course! It was delivered enthusiastically and engagingly, and was a highlight of my final year! My time at LSE was marked by great people and experiences throughout; in particular, it was of course dominated by a series of fascinating (and deeply challenging!) maths courses. What I realised during the MA318 course, and have come to appreciate even more deeply in retrospect, is what really grabs me about the History of Maths: that it ties together the different areas of the subject into a cohesive big picture. Not only that, but it provides a spectacularly detailed narrative for how that ‘big picture’ emerged.

I am now teaching Maths full-time at a secondary school in South London, and my lessons are packed with historical context: stories of where the Maths came from, and how it ties in with everything in the period – from the body of academic knowledge at the time, to the social and political situation! I am grateful to Professors Biggs, Wilson, and Barrow-Green for an inspiring and informative experience that has enriched my own teaching, and has opened me to an entirely new dimension of our subject!”

A brief trip to London for the two colloquia; what follows are random musings on what we heard in a very enjoyable two days.

Three speakers mentioned Galton–Watson branching processes in similar contexts: **Andrew Treglown** (Birmingham), **Oliver Riordan** (Oxford) and **Guillem Perarnau** (Birmingham). I got two insights from this, which I had probably not realised clearly before. First, the Erdős–Rényi random graph is the same thing as percolation on the complete graph. (For percolation on a graph, we allow edges to be open independently with probability *p* and ask questions about whether the resulting graph has a “giant component” and similar things. Erdős and Rényi simply included edges with probability *p* from all 2-subsets of the vertex set, and asked similar questions.)
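To make the first insight concrete, here is a minimal Python sketch of percolation on the complete graph: every 2-subset of the vertex set is opened independently with probability *p*, and we report the size of the largest component. The function name and the union-find bookkeeping are my own choices for illustration, not anything presented in the talks.

```python
import random
from itertools import combinations


def largest_component_gnp(n: int, p: float, rng: random.Random) -> int:
    """Size of the largest component after opening each pair with probability p."""
    parent = list(range(n))

    def find(x: int) -> int:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(range(n), 2):
        if rng.random() < p:          # this edge of the complete graph is open
            parent[find(u)] = find(v)  # merge the two components

    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


rng = random.Random(2016)
n = 2000
for c in (0.5, 1.0, 2.0):
    print(c, largest_component_gnp(n, c / n, rng))
```

Writing *p* = *c*/*n* makes it easy to compare runs on either side of *c* = 1, the well-known threshold for a giant component in the Erdős–Rényi model.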

Secondly, for percolation in any regular graph of valency *d*, to explore the component containing *v* you look at the neighbours of *v*, their neighbours, and so on until you have seen everything. The number of neighbours of any vertex (apart from the vertex on the path from *v* by which we reached it) is a binomial random variable Bin(*d*−1,*p*), and the whole process is “stochastically dominated” by the Galton–Watson process with this rule for descendants. (In an infinite tree, they would be identical; but the possibility of short cycles means that we will not see more vertices in the percolation case than in the Galton–Watson case.)
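The dominating Galton–Watson process is simple to simulate. The sketch below is an illustration under my own simplifications: every individual, including the root, has Bin(*d*−1, *p*) children, and the simulation stops once the population reaches a hypothetical `cap`, so that supercritical runs terminate.

```python
import random


def binomial(n: int, p: float, rng: random.Random) -> int:
    """One Bin(n, p) sample as a sum of n independent Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


def galton_watson_size(d: int, p: float, rng: random.Random, cap: int = 10_000) -> int:
    """Total population of a Galton-Watson tree with Bin(d-1, p) offspring.

    Returns `cap` if the population reaches the cap before dying out.
    """
    alive = 1  # individuals in the current generation
    total = 1  # everyone seen so far
    while alive > 0 and total < cap:
        children = sum(binomial(d - 1, p, rng) for _ in range(alive))
        total += children
        alive = children
    return min(total, cap)


rng = random.Random(12)
d = 4
for p in (0.2, 1 / (d - 1), 0.5):
    sizes = [galton_watson_size(d, p, rng) for _ in range(200)]
    print(round(p, 3), max(sizes))
```

The mean number of children is (*d*−1)*p*, so the process is critical at *p* = 1/(*d*−1); below that value every run dies out quickly, while above it some runs hit the cap.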

The three speakers tackled different problems. Perarnau was looking through the critical window in the percolation problem for regular graphs. Treglown was examining resilience of random regular graphs, e.g. how many edges can you delete before losing the typical value of a Ramsey number in the graph. (As he said, there are two processes going on here: first choose a random graph, then delete an arbitrary subset of its edges.) Riordan was considering the Achlioptas process, a variant on Erdős–Rényi where, at each step, you choose two random edges, and keep one, the choice depending on the size of the components they connect.

**Kitty Meeks** (Glasgow) talked about complexity problems for sum-free sets. How do such problems arise? She began with a theorem of Erdős, according to which any set of *n* natural numbers contains a sum-free subset of size at least *n*/3. Now one can ask whether there is a sum-free subset of size *n*/3+*k*, or to count such subsets. This kind of problem lends itself very well to analysis via *parameterised complexity*.
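For readers who want to see Erdős’s theorem in action, the following Python sketch derandomises the standard probabilistic proof; this is the textbook argument as I recall it, not something taken from the talk. Pick a prime *p* = 3*k*+2 larger than every element, note that the “middle third” {*k*+1, …, 2*k*+1} is sum-free modulo *p*, and try every multiplier *t*, keeping the elements that *t* sends into the middle third. All function names are hypothetical.

```python
def is_prime(m: int) -> bool:
    """Trial-division primality test, adequate for small inputs."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def is_sum_free(numbers) -> bool:
    """True if no x, y in the set (x = y allowed) have x + y in the set."""
    s = set(numbers)
    return all(x + y not in s for x in s for y in s)


def large_sum_free_subset(numbers):
    """A sum-free subset with more than len(numbers)/3 elements."""
    nums = sorted(set(numbers))
    if not nums or nums[0] <= 0:
        raise ValueError("expected a non-empty set of positive integers")
    # Smallest prime p = 3k + 2 exceeding every element.
    p = nums[-1] + 1
    while not (p % 3 == 2 and is_prime(p)):
        p += 1
    k = (p - 2) // 3
    best = []
    for t in range(1, p):
        # Elements mapped by x -> t*x mod p into the middle third.
        chosen = [x for x in nums if k + 1 <= (t * x) % p <= 2 * k + 1]
        if len(chosen) > len(best):
            best = chosen
    return best


subset = large_sum_free_subset(range(1, 21))
print(subset, is_sum_free(subset))
```

Each element lands in the middle third for *k*+1 of the 3*k*+1 multipliers, so on average more than a third of the elements are kept and the best multiplier does at least as well. The loop over all multipliers costs roughly *p* times *n* steps, which is only sensible for small inputs; deciding whether a subset of size *n*/3+*k* exists is the harder question the talk addressed.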

**Sophie Huczynska** (St. Andrews) talked about the “homomorphic image” order on relational structures such as graphs: in this order, *G* is smaller than *H* if there is a homomorphism from *G* to *H* mapping the edge set of *G* onto the edge set of *H*. (She began by describing a variety of graph orders, such as subgraph order, induced subgraph order, and minor order, and showing how all can be expressed in terms of homomorphisms.) Typical questions are whether a class of graphs or other structures is a *partial well-order* with respect to the homomorphic image order (that is, whether it contains no infinite antichains), and describing the antichains if possible.

A special mention for **Ewan Davies** (LSE), who stepped in at very short notice when Agelos Georgakopoulos (Warwick) was unable to speak. (We found out the reason later; his wife had produced their first son that very day.) Ewan gave a fascinating talk on getting accurate values or estimates for the coefficients of the matching polynomial of a graph (the polynomial in which the coefficient of *x ^{k}* is the number of matchings with

*This post was originally published on Peter Cameron’s Blog*

While much of the rest of London was busy Christmas shopping, around 25 mathematicians (some from as far away as Brazil, the US and Israel) spent the week of 12-19 December 2016 doing what mathematicians are accustomed to doing: thinking – or rather thinking and discussing. Only this time the locus of this thinking was the side-wing of the British Library that is the new and shiny Alan Turing Institute. And the mathematicians met for a workshop on “Large-scale structures in random graphs”, organised by three LSE academics (Peter Allen, Julia Böttcher and Jozef Skokan), and funded jointly by the Heilbronn Institute and the Alan Turing Institute.

Well, in the mornings there were lectures; and there was coffee, of course (the great Hungarian mathematician Paul Erdős famously said that “Mathematicians are machines for turning coffee into theorems”). The afternoons were spent discussing in small groups, scribbling formulas and sketching pictures on boards, walls, and meeting room glass panes. (Yes, the Alan Turing Institute has white-board walls and special markers for writing on glass. Coffee, by the way, comes out of what looks like an ordinary tap, operated via a tablet.)

It was an intense and highly successful workshop, pushing further the boundaries of what we currently know in the area. And it concerned an important topic: the study of how certain types of large networks form and what we can mathematically say about their global structure, despite having only very limited local information. In the current age, where a vast amount of data is shared on large networks (such as the internet or Facebook), it is vital to acknowledge that, unsatisfactorily, there are still many features of even simplified mathematical models underlying such networks that we do not adequately understand.

We will now try to give you some insight into this. We should start by explaining some of the words appearing in the title of the workshop.

The word ‘graph’ means two things in mathematics. One is the familiar x-y plot which shows us, for example, how much more expensive Euros have become since Brexit. The other, which is an abstract representation of two-body interactions, is what this workshop is about (see Figure 1). A graph has vertices (representing the interacting bodies), some pairs joined by edges (representing an interaction) and others not. Think of a social network in which people (vertices) may be friends (an edge is present) or not; or a road network where places are connected by roads (and in this example some edges, the one-way streets, come with a direction (see Figure 2)). We usually think of graphs visually, drawing points for the vertices and lines between two points for the edges.

‘Random graph’ is a misnomer, really: it’s not ‘a graph’. What we mean is a way of generating graphs which uses some randomness. Most simply, we decide on the number n of vertices we will have, and then for each pair of vertices we flip a coin to decide whether to put an edge in or not (so we will do a lot of coin flipping). In this case, each pair of vertices will turn out to be an edge with probability ½, independently. There are (many) other options. We could use a biased coin, which gives us an edge with some fixed probability p not necessarily equal to ½. This method of generating random graphs is called G(n,p), and it is the method on which most time has been spent, both in mathematics in general and at the workshop. There are many other methods one could choose, either because they are supposed to model some real-world phenomenon or because they are mathematically interesting, but even after 60 years of research there are still many things about the simple G(n,p) model that we do not understand.

The kind of thing we end up writing formally are sentences like ‘With high probability a random graph generated according to G(n,p) is connected’. The word ‘connected’ means what it sounds like: you can get from any one vertex to any other by following edges. And ‘with high probability’ means that, as the number n of vertices gets large, the probability gets closer and closer to one. Informally, we would say ‘A typical random graph G(n,p) is connected’; if you actually follow the random method G(n,p) of generating a graph, the graph you end up with is very likely to be connected.
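To make the procedure concrete, here is a minimal Python sketch of generating a graph according to G(n,p) and checking whether it is connected. The function names (`sample_gnp`, `is_connected`, `connected_fraction`) are our own illustrative choices, not standard library names, and the connectivity test is an ordinary breadth-first search.

```python
import random
from collections import deque


def sample_gnp(n, p, rng):
    """Return adjacency sets for one graph drawn according to G(n, p)."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # one (possibly biased) coin flip per pair
                adj[u].add(v)
                adj[v].add(u)
    return adj


def is_connected(adj):
    """Breadth-first search from vertex 0; connected if every vertex is reached."""
    if not adj:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


def connected_fraction(n, p, trials, seed=0):
    """Fraction of sampled graphs that turn out to be connected."""
    rng = random.Random(seed)
    hits = sum(is_connected(sample_gnp(n, p, rng)) for _ in range(trials))
    return hits / trials
```

Calling, say, `connected_fraction(200, 0.5, 20)` gives an empirical feel for ‘with high probability’: it estimates how often the graph you end up with is connected.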

To get a feel for how things work in this area, let’s try to explain why this statement is true, for a fair coin (p=1/2). We’ll try to argue that not only is there a way to get from any x to any other y, actually we can even always do it using exactly two edges. We start off with our n vertices. Let’s call two of them x and y. We want to know, to start with, how likely it is that x and y are connected using exactly two edges when we generate G(n,p). In other words, we want to know how likely it is that there is some z such that xz and zy are both edges (see Figure 3). Apart from x and y, there are n-2 other vertices, so there are n-2 choices for z.

If we fix any given one of these, the chance that xz and yz both appear in G(n,1/2) is 1/4 – the chance of getting heads twice when we flip two coins, because that’s exactly what we do to decide whether these two pairs are edges. But if we look at all of the n-2 choices of z, there is that 1/4 chance each time; and the pair x,y only needs to get lucky once. The probability they don’t get lucky is (3/4)^(n-2). To fix an example, if n is 8,000, that is roughly 10^(-999), even more unlikely than winning the Lottery a hundred and thirty times in succession – we should be pretty confident that x and y turn out to be connected.

We want to know the whole graph is connected – ideally, we hope that every pair of vertices wins this little game (and let’s forget about all the other ways they might get connected; taking them into account can only help us find connections). There are lots of pairs of vertices – roughly n^2/2. It’s easy to check that n^2/2 grows much more slowly than (3/4)^(n-2) shrinks, so on balance, you should be amazed if even one single pair of vertices loses its little game when n is large. Returning to the n=8,000 example, in this case n^2/2 is about 32 million; about as many people as play the Lottery each time. No-one has ever even won the Lottery twice on consecutive draws, let alone a hundred and thirty times – so you should believe, at least for n=8,000, that it’s clear G(n,1/2) is very likely to be connected.

There is a subtlety here: these little games are not independent; every pair x,y is playing as part of the one big game of constructing G(n,p), just as when many people play the Lottery, they all get results based on one set of balls drawn. When people play the Lottery, they could all choose the same numbers – then there would be a very low chance of anyone winning. Or they could all choose different numbers – this makes it as likely as possible that someone will win. With 32 million people, they can actually choose all the possibilities and be sure of someone winning. If we knew how all the players in the Lottery chose numbers, we could calculate the chance that there will be a winner – the answer is somewhere between 1 in about 14 million (if everyone picked the same numbers) and certain (if all the players conspired to all choose different numbers). Similarly, it’s easy to find out how likely it is that a given pair x,y loses the little game and turns out not to be connected by a two-edge path (not very!) but what we want to know is the chance that at least one of the roughly n^2/2 pairs loses. Just like with the Lottery, that could be anywhere from (3/4)^(n-2) up to (3/4)^(n-2) × n^2/2, depending on how the little games overlap. The right answer is somewhere between these extremes, but it gets hard to calculate. If I tell you that there is a connection from x to y, you should be more confident there is one from x to some other vertex w as well, for example. But in this example, it’s enough to know that the probability of losing in some little game can’t be bigger than (3/4)^(n-2) × n^2/2.
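The worst-case bound is easy to evaluate numerically, as long as we work with logarithms: for n=8,000 the number itself is far too small for ordinary floating-point arithmetic. The following Python sketch (the function name `log10_union_bound` is ours) computes the base-10 logarithm of (3/4)^(n-2) times n^2/2, using the same rough pair count n^2/2 as the text.

```python
import math


def log10_union_bound(n):
    """log10 of (3/4)^(n-2) * n^2/2, the worst-case chance that some pair loses."""
    log_single_pair = (n - 2) * math.log10(0.75)  # one fixed pair x, y loses
    log_pair_count = math.log10(n * n / 2)        # roughly n^2/2 pairs
    return log_single_pair + log_pair_count


if __name__ == "__main__":
    for n in (50, 100, 1000, 8000):
        print(n, log10_union_bound(n))
```

For n=8,000 the result is a little below -991, so even under the pessimistic assumption that the little games conspire against us, the chance that any pair loses is a number with nearly a thousand zeros after the decimal point.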

This example isn’t hard, but the reason is that we could get away with being lazy doing calculations. We wanted to show that typically G(n,p) is connected; we can get from any one vertex to any other. It turned out to be enough to look only at connections using exactly two edges (not one, not three, not more) to convince ourselves that this is likely, which made our lives easier. And it turned out to be enough to pretend the little games of connecting each pair x,y were conspiring to make our lives hard – a bit like the players of the Lottery conspiring to all choose different numbers so that there is a winner for certain – even though we know that’s not really the case. We were playing a game with the odds stacked heavily in our favour. So let’s make it harder, and consider G(n,p) with some p much less than 1/2. Now we flip a biased coin to see whether each edge appears; we expect to get far fewer edges, and of course if we have fewer edges then (intuitively) it should be less likely we will get a connected graph. What we would really like to know is the ‘threshold’: how small can we make p and still be fairly confident of getting a connected graph (with say probability 99%)?

This is too hard to answer properly in a blog post. In fact, it’s getting closer to some open problems in the area – one of our speakers, Asaf Ferber, talked about some exciting new progress.

You could now skip to the next section – but if you want a vague idea of what the answer to this ‘threshold’ problem is, here is a vague explanation. One thing we can say is, if we want to be confident that G(n,p) is connected, we should certainly be confident that every vertex is in at least one edge: if a vertex isn’t in any edges, we can’t go anywhere from it and we certainly do not have a connected graph. It turns out this gives the answer: if p is large enough that with 99.5% probability every vertex is in at least one edge, then also with 99% probability the whole graph G(n,p) will be connected (when n is large, anyway). To see why this is true, you could try to re-do the explanation above, but consider connecting paths of all lengths (not just length two) between each x and y. Working out how likely it is that even one pair x,y wins its little game gets hard – but if you make life easier by throwing away possibilities as we did above, then what happens is that you probably threw away all the paths connecting x and y, so you don’t see that you won (like turning off the television half-way through the Lottery draw). Then you would have to put all these little games together, and this time you would have to work out how much they overlap – the worst case we used above, pretending that they were conspiring against us, won’t work this time; you would end up thinking you are very unlikely to get a connected graph, even though that’s not the case. It turns out these little games do conspire a bit, but not much. Actually, trying to do this whole calculation gets too complicated for anyone to solve. 
We do know the answer, but we use a different route to get to it – roughly, rather than trying to find paths connecting each pair of vertices, we start by arguing that if the graph is not connected, then that means it splits into two parts with no roads between – like the Irish Sea between the street-maps of Britain and Ireland – and show that oceans don’t tend to show up in random graphs.
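For readers who want the precise statement behind this vague explanation, the classical result (due to Erdős and Rényi, and quoted here from the standard literature rather than from the workshop) can be written as follows, where c is any fixed real number and the notation is ours:

```latex
\[
  p = \frac{\ln n + c}{n}
  \quad\Longrightarrow\quad
  \lim_{n\to\infty} \Pr\bigl[\,G(n,p)\text{ is connected}\,\bigr]
  \;=\;
  \lim_{n\to\infty} \Pr\bigl[\,G(n,p)\text{ has no isolated vertex}\,\bigr]
  \;=\; e^{-e^{-c}} .
\]
```

So the threshold sits at p of about (ln n)/n, and taking c around 4.6 makes the limiting probability about 99%.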

More or less, this is where all the difficulty is in studying random graphs. Even though the edges are independent – whether one pair turns out to be an edge doesn’t affect another, because one coin toss doesn’t affect another – the properties we want to know about tend to involve playing a lot of little games, each one of which we will easily win but which do depend on each other in some complicated way. It’s hard to figure out if we’re likely to win the big game by winning all the little ones.

We want to know about properties for which you have to look at the whole random graph (or at least most of it), like ‘being connected’. If you only look at part of the random graph, you don’t know whether it’s connected. The other extreme is asking a question like ‘Are there four vertices with all six possible edges present?’ – if the answer is yes, I can convince you just by showing you the right four vertices. Usually we can answer questions of this type quite well.

Another example is that we might want to know whether a travelling salesman can go round all the vertices of the graph, following edges, until they return to their start without ever having to revisit vertices on their way – this is called a Hamilton cycle.

Again, it’s not too hard to show that in G(n,1/2) the answer is yes; again, if we have fewer edges – if we look at G(n,p) for some small p – it obviously gets less likely. As with connectivity, obviously we need every vertex to be in at least two edges for this (otherwise the salesman goes into a vertex that’s in only one edge and can’t go out again), and it turns out that if p is large enough that with probability 99.5% every vertex is in at least two edges, then with probability at least 99% we will have the Hamilton cycle.

From the examples above, it sounds like we know a lot about how big p should be in order that we are confident that a specific structure shows up in G(n,p), like a Hamilton cycle. Actually, that’s not the case; for a lot of structures, not much more complicated than a Hamilton cycle, we don’t really know. We spent quite a bit of time working on a particular class of structures, ‘bounded degree graphs’, where we think we know what the answer could be. Some of the participants brought partial answers to the workshop, and putting them together it looks possible that a solution will come out.

What if the Highways Agency try to obstruct the travelling salesman by doing roadworks to cut him off? If they dig up too many roads leaving one town, the residents will complain to their MP – so the agency won’t do this. Can they still block the salesman? This sort of question is called a resilience problem. We know only a couple of techniques to work on this kind of problem, and a lot of the time they don’t work. We tried one of the simplest problems where the known techniques don’t seem to work, ‘tight Hamilton cycles in hypergraphs’. Some progress was made, but we did not yet get to a solution.

What about atypical random graphs? We expect about n^2/4 edges to show up in G(n,1/2); if there are many fewer, or many more, than that, then something weird happened. What does the result look like, and how likely is the weird event? In this case, the result looks like you accidentally used a biased coin: it looks like G(n,p) for some p not equal to 1/2, and we can calculate the chance of that happening easily. What about if we counted triangles instead? In this case, if there are too few triangles, again G(n,1/2) probably looks like a typical G(n,p) for some p<1/2. If there are too many triangles, though, something else can happen: probably there are too many triangles because a little set of vertices gets far more edges than we expect, while the rest of G(n,1/2) still looks like a typical G(n,1/2). We don’t understand this sort of problem very well in general. Simon Griffiths and Yufei Zhao both talked about these problems, and two groups worked on different aspects. One group got results on their problem; the other made progress, but some pieces are still missing.

In another direction, Ramsey’s theorem says: for any graph H, there is a graph G such that however you colour the edges of G with red and blue, you will find a copy of H (plus maybe some extra edges) with either all edges red or all edges blue. Think for example of a path with n edges; there is a graph G which, however you colour the edges, will contain either a red or a blue path with n edges. How many edges does G need to have? It’s obvious that to get an n-edge path you need at least n edges in G; what number is enough? It turns out 100n will do. This problem is the size Ramsey problem; it doesn’t obviously have anything to do with random graphs. But there is a surprising connection: for many graphs (such as the n-edge path), the best graph G we know works is a typical random graph. But again we usually don’t know any very good answer; the path is one of the few graphs where we ‘only’ have a constant-factor gap between the lower and upper bounds. David Conlon’s talk was about how to do better at proving that a random graph with few edges works in general, and he posed the problem of showing that path-like graphs – distance powers of paths – also have size Ramsey number growing linearly in n. Here it’s clear that a random graph on its own does not work, but the group working on this problem came up with a nice construction, modifying a random graph, which they could show works, solving the problem.

We had several more talks and groups, which made good progress on more problems in this sort of spirit. But this piece is long and technical enough by now – to wrap up, we had a good week with excellent talks and productive sessions of group work; some of the results will be written up directly, while other groups will continue collaboration over the next months and years.


**What piqued your interest in the topic of contest theory?**

I got really interested in this topic around the time when crowdsourcing services started to gain wider popularity among online services. For instance, there are online labour platforms that use rank-based compensation schemes, such as TopCoder, a popular online platform for competition-based software development. I was driven by intellectual curiosity to try to understand how online workers choose which projects to work on and how much effort to invest in their chosen projects. This decision-making is based on available information such as how one’s skills match different project requirements and stand in comparison with other online workers. I appreciated that game theory would provide me with a good framework to model rational user behavior using a set of first principles and to study the properties of outcomes in strategic equilibrium. I was also driven by a need for a unified treatment of various statistical methods for inference of individual skills based on observed contest outcomes.

**What kind of readers would benefit from Contest Theory and what is the key message you hope readers will take away?**

I think that the book is unique in providing comprehensive coverage of contest theory developed in the areas of economics, computer science and statistics. This includes different kinds of strategic games that model contests and their equilibrium outcomes, statistical models of contest outcomes, and methods for inference of individual skills. The kind of readers who would benefit from the book include those who are interested in learning the underlying mathematical foundations, for example in order to pursue further research in the area, and those who are more interested in learning various models of contests in order to apply them to data. I hope that readers will appreciate that contest theory contains rather elegant and insightful theory, which offers plenty of hypotheses about user behavior waiting to be tested on data.

**What application scenarios can be studied using contest theory?**

There are numerous traditional application scenarios that have been studied in economic literature, such as rent-seeking and promotion tournaments in firms. A classic example of a rent-seeking activity is political lobbying where companies use their capital to contribute to politicians who influence the laws and regulations that govern industry and how government subsidies are distributed within the industry. For example, presidential campaign donations, like in the recent US presidential election, may well have an underlying rent-seeking objective. In firms, rank-based compensation schemes have been widely used in different industries. These schemes sometimes create incentives that are misaligned with a firm’s objective; for example, for a number of years Microsoft used a stack ranking system for employee performance review, but eventually abandoned it to promote the teamwork culture and focus on the customers.

There are also numerous application scenarios in the context of various resource allocation problems and online services. For example, contest theory has been used to study allocation of computer and communication network resources, such as processor duty cycles, storage, and network bandwidth. It has also been widely used in the context of online services, such as online labour platforms and online Q&A services, which incentivise user contributions by monetary rewards or some sort of credits based on relative individual performance.

**Can you touch on how your experiences working at Microsoft Research helped to shape the contents of this book?**

It helped me to develop an appreciation of formulating and studying mathematical models that are motivated by real-world applications and trying to derive practical implications from theoretical results. It gave me an opportunity to work on several projects related to the topic of the book, in collaboration with some truly brilliant colleagues and students. For example, one of the projects resulted in one of the first models of crowdsourcing services as a system of simultaneous all-pay auctions. I have also been inspired by some prominent work of my colleagues on the design of skill rating systems for online gaming systems.

**How does your current position of professor in data science relate to the topic of the book?**

Data science is an interdisciplinary field that includes a wide range of scientific methods for learning from data. Contest theory can be seen as a subset of these scientific methods, including game theory as a mathematical framework to reason about strategic data sources and statistical models and methods for ranking data. I am in the fortunate situation of being affiliated with an institution for which understanding user behavior and “To Know the Causes of Things” is very much at the centre of its focus.

To find out more about *Contest Theory*, visit http://bit.ly/contesttheory.

**How would you describe your research interests to a non-mathematician?**

I usually say that I work in discrete mathematics, and that this is a subject that supports computer science in a way that is somewhat similar to how calculus supports physics. This usually suffices, but sometimes it leads to a discussion of continuous motion as opposed to the series of snapshots that make up a movie. With the more curious, I may talk about counting, probability, and modeling with graph theory.

**Consider me a more curious person; what examples of modelling in graph theory do you look at in your research?**

First I would preface the discussion with two examples:

- Mathematicians have been studying prime numbers for centuries, going back to the ancient Greeks. With the assistance of record keeping their understanding became both much more sophisticated and much more widely disseminated over time. This work was a huge intellectual accomplishment with no conceivable functional application until the development of computers and the need for encryption. Today the theory of prime numbers is the backbone of encryption, and as such, is fundamentally important to society.
- Another example is the probabilistic method introduced by Paul Erdős to attack existence problems for various discrete structures. These problems were purely intellectual questions; there was no intention to apply results in any practical way. The technique was so effective that it quickly spread among mathematicians as it was used to answer more and more questions. It was not the individual answers that were of major importance, but the development and spread of the technique and related ideas. Soon it spread to theoretical computer science, and then to the practical algorithms used in today’s software.

With this background I will return to your question, and demonstrate this process operating on a much smaller scale. In graduate school I was working in logic on questions about what functions on infinite partial orderings can be calculated by an algorithm. The algorithm did not have to be fast, or polynomial, or even exponential; it only had to eventually give the correct answer. Later, while working in my first job I reformulated the underlying question in terms of online algorithms for finite partial orderings, and with Tom Trotter gave an exact answer for the special case of interval orders. Computer scientists use interval orders to model the problem of using as few storage locations as possible to store variables while running a program. Their techniques give optimal answers if all variables take the same space, but it is an NP-complete problem when the variables have different sizes. By using my work on First-Fit and online algorithms I was able to give the first two polynomial-time linear approximation algorithms for this problem.

**If you hadn’t seen these applications of your research, would you be disappointed by that or would you have changed your area of research?**

I investigated interval orders because I was aware there could be applications, so if I had not found them I would have been disappointed, but it would not have affected the main line of my research.

**If a fledgling mathematics student (at any level, Undergraduate to PhD) asks you for advice on whether they should do more applied or more theoretical courses or research, how would you guide them?**

There are several considerations:

- I think a strong undergraduate student who wants to go to graduate school in applied math should concentrate on proof driven theory courses. There is plenty of time to learn applications later. That said, it is worthwhile to investigate what areas of applied mathematics fit the student’s interests.
- Weak to average undergraduates gain little from proof driven theory courses, but seem to do well when they enter the job market.
- My brother and I both got PhDs in logic in 1979. It was a time when academic jobs were very tight. I took the academic route whilst my brother took the industry route. We both had very satisfying careers. I had more freedom to do what I wanted, and what I found interesting (and important); he made twice as much money. My brother said that the most important thing for him was to learn as many first year graduate courses as possible. I often tell this story to students.

I think mathematics students have many choices for careers, and my role should be to inform, not advise.

**What other interests do you have? How do you “switch off” from mathematics (assuming you do, now and then)?**

I like to exercise, especially hiking and biking. I follow soccer (football) closely, especially the EPL, and watch games while pedaling my stationary bike. I do not play anymore, but I can feel like I played by the end of the match! I like to relax before dinner with a beer, sitting on my back deck looking at the hills, and reading politics or light novels.

**That’s a fantastic view in the photograph you sent us. What city is in the background?**

Mostly Phoenix; Tempe is to the right.

**What is an online algorithm?**

An algorithm is a set of instructions to process the input for a given problem. One classic analogy is to see it as a cooking recipe: the algorithm is the cooking instructions, the input is the ingredients and the resulting dish is the output. In the classical setting, algorithms have access to the entire input and the algorithm is a function applied to this input, the result of the function being the output. In contrast, in the *online* setting, the input is revealed sequentially, piece by piece; these pieces are called requests. Moreover, after receiving each request, the algorithm must take an action before the next request is revealed. That is, the algorithm must make irrevocable decisions based on the input revealed so far without any knowledge of the future input. In other words, at each request, a function is applied to the revealed input. Since the future is unknown, these decisions could prove very costly.

Online problems have many real-world applications, where the input is sequential and each request demands an immediate action. One such example is the problem of managing the fast memory in computers. This is called the *paging problem* and it consists of a universe *U* of pages (the data in the slow memory) and a cache of fast memory that can hold *k* pages. The input is a sequence of requests for pages in *U*. A page fault occurs when a requested page is not in the cache; in such a case, a page in the cache must be replaced by the requested page. The goal is to minimize the number of page faults over the request sequence.

**What is competitive analysis of online algorithms?**

The standard technique for evaluating the performance of online algorithms is called *competitive analysis* and it was developed with the aim of understanding the performance of an algorithm with no knowledge of the future against an algorithm with full knowledge of the future. This is done by determining the maximum ratio, called the *competitive ratio*, of the value of the solution produced by the online algorithm as compared to the value of the optimal solution for any fixed-length request sequence. This ratio can be seen as the price for not knowing the future. It is a simple and effective framework for evaluating the performance of online algorithms and has been instrumental in making online computing a well-established field of theoretical computer science.
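In symbols, for a cost-minimization problem, one common way to write this ratio is the following. The notation is ours, not the interview's: ALG(σ) denotes the cost of the online algorithm on request sequence σ, and OPT(σ) the cost of an optimal solution computed with full knowledge of σ.

```latex
\[
  \mathrm{cr}(\mathrm{ALG}) \;=\; \sup_{\sigma}\,
  \frac{\mathrm{ALG}(\sigma)}{\mathrm{OPT}(\sigma)} .
\]
```

Many textbook treatments also allow an additive constant that does not depend on σ; the version above is the simplest form.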

Even though competitive analysis is the yardstick for analyzing online algorithms, it sometimes fails to distinguish between algorithms for which experimental evidence suggests a significant difference in terms of performance. This was even noted in the seminal work of Sleator and Tarjan. The classic example is the paging problem. Competitive analysis says that the strategies LEAST RECENTLY USED (LRU) (on a page fault, replace the least recently used page) and FLUSH WHEN FULL (FWF) (on a page fault, remove all the pages from the cache) have the same competitive ratio, whereas, in practice, LRU is much better than FWF. Such disconnects between theory and practice have motivated the study of alternatives to the competitive ratio such as the *bijective ratio* (or more generally *approximate stochastic dominance*), the topic of my talk.
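The gap between the two strategies is easy to see in a small simulation. The Python sketch below counts page faults for LRU and FWF on a given request sequence. The function names are ours, and the FWF variant shown empties the cache only when a fault occurs while the cache is already full, which is the usual reading of the strategy's name and is an assumption on our part.

```python
from collections import OrderedDict


def lru_faults(requests, k):
    """Page faults of LEAST RECENTLY USED with a cache of k pages."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults


def fwf_faults(requests, k):
    """Page faults of FLUSH WHEN FULL with a cache of k pages."""
    cache = set()
    faults = 0
    for page in requests:
        if page not in cache:
            faults += 1
            if len(cache) >= k:
                cache.clear()              # flush the whole cache
            cache.add(page)
    return faults
```

On the made-up sequence 1, 2, 3, 1, 2, 4, 1, 2 with k = 3, LRU incurs 4 faults while FWF incurs 6, because the flush throws away pages 1 and 2 just before they are requested again.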

**What is the bijective ratio and how does it enrich the analysis of online algorithms?**

Consider two lotteries *A* and *B*. Say that the probability of winning at least £*x* is higher in *A* than in *B* for any value *x*. That is, you are more likely to win £10, £100 or £1,000,000 in *A* than you are in *B*. Obviously, you would prefer to buy a ticket for *A* rather than *B* (maximizing profit), and, from the lottery's perspective, they would prefer you buy a ticket in *B* (minimizing cost). This notion is called *stochastic dominance*. That is, *A* stochastically dominates *B*. This technique is often used in decision theory and microeconomics, and it has been used for evaluating the performance of online algorithms, but less so than competitive analysis. When analyzing online algorithms using stochastic dominance, if the requests are drawn from a uniform distribution, we have *bijective analysis*. This stipulates a very stringent relationship between the compared algorithms. Suppose now that *A* and *B* are algorithms, and we are interested in minimizing the cost over a request sequence. We say that *A* is *bijectively better* than *B* if the cost of *B* stochastically dominates the cost of *A* when the requests are sampled according to a uniform distribution. This relation not only implies that the average cost of *A* is no more than the average cost of *B*, but that *A* has at least as many request sequences as *B* that engender a cost below any threshold *c*. In previous works, these techniques have been able to provide a clear separation between algorithms where competitive analysis did not; the paging problem being the most notable. However, there are situations in which it may be too difficult to establish this relationship analytically, or it may not even exist.
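For very small instances, the relation can be checked by brute force: enumerate every request sequence of a given length, tabulate how many sequences give each cost, and compare the two tables threshold by threshold. The Python sketch below does this for paging with LRU and FWF. All names are illustrative, the compact simulator `faults` is our own, and FWF is again assumed to flush only when a fault occurs with a full cache.

```python
from itertools import product


def faults(requests, k, flush_when_full):
    """Page faults for LRU (flush_when_full=False) or FWF (flush_when_full=True)."""
    cache = []  # least recently used page first
    count = 0
    for page in requests:
        if page in cache:
            cache.remove(page)       # hit: re-appended below as most recent
        else:
            count += 1
            if len(cache) == k:
                if flush_when_full:
                    cache.clear()    # FWF: empty the whole cache
                else:
                    cache.pop(0)     # LRU: evict the least recently used page
        cache.append(page)
    return count


def cost_counts(cost, universe, length):
    """Map each cost value to the number of request sequences attaining it."""
    counts = {}
    for seq in product(universe, repeat=length):
        c = cost(seq)
        counts[c] = counts.get(c, 0) + 1
    return counts


def bijectively_no_worse(counts_a, counts_b):
    """True if, for every threshold, A has at least as many sequences
    of cost at most that threshold as B has."""
    below_a = below_b = 0
    for t in sorted(set(counts_a) | set(counts_b)):
        below_a += counts_a.get(t, 0)
        below_b += counts_b.get(t, 0)
        if below_a < below_b:
            return False
    return True


if __name__ == "__main__":
    pages, k, length = range(3), 2, 6
    lru = cost_counts(lambda s: faults(s, k, False), pages, length)
    fwf = cost_counts(lambda s: faults(s, k, True), pages, length)
    print(bijectively_no_worse(lru, fwf), bijectively_no_worse(fwf, lru))
```

Enumeration grows exponentially with the sequence length, so this is only a way to build intuition on toy instances, not a substitute for the analytical arguments.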

For this reason, we introduce the *bijective ratio*, where we extend bijective analysis in the spirit of the competitive and approximation ratios. That is, we introduce a *ρ*-approximate stochastic dominance. Going back to the lottery example, this would be that you are more likely to win £*x* in *A* than £(*ρx*) in *B*. This notion maintains the essential aspect of bijective analysis in that the average cost of *A* is no more than *ρ* times the average cost of *B* and that *A* has at least as many request sequences that engender a cost below *ρc* as *B* has request sequences that engender a cost below *c* for any *c*. Moreover, this relaxation makes these analysis techniques amenable to all online problems, and they can be used both to compare two online algorithms, or to compare an online algorithm to the optimal offline algorithm.
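One way to read this definition computationally is sketched below in Python. Given two tables that map each cost value to the number of request sequences attaining it, the code searches for the smallest ratio ρ such that, for every threshold c, the first algorithm has at least as many sequences of cost at most ρc as the second has of cost at most c. The function names, the use of "at most" for "below", and the toy tables in the usage note are our assumptions for illustration, not taken from the talk.

```python
from fractions import Fraction


def approx_dominates(counts_a, counts_b, rho):
    """True if for every threshold c:
    #(sequences with A-cost <= rho * c) >= #(sequences with B-cost <= c)."""
    # B's count only changes at costs that B actually attains, so it is
    # enough to test those thresholds.
    for c in sorted(counts_b):
        b_below = sum(n for cost, n in counts_b.items() if cost <= c)
        a_below = sum(n for cost, n in counts_a.items() if cost <= rho * c)
        if a_below < b_below:
            return False
    return True


def bijective_ratio(counts_a, counts_b):
    """Smallest rho for which approx_dominates holds, or None if there is none.

    A's count at threshold rho * c only changes when rho * c passes a cost
    attained by A, so the optimum is one of the ratios a / c.
    """
    candidates = sorted(
        {Fraction(a, c) for a in counts_a for c in counts_b if c > 0}
    )
    for rho in candidates:
        if approx_dominates(counts_a, counts_b, rho):
            return rho
    return None
```

For the hypothetical tables `{2: 1, 4: 3}` for the first algorithm and `{1: 1, 2: 2, 4: 1}` for the second, `bijective_ratio` returns 2: the first algorithm's cheapest sequence costs 2 while the second has a sequence of cost 1, so no smaller ratio can work.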

**What other areas of research are you interested in?**

In general, I am interested in the design and analysis of algorithms. In particular, I tend to be interested in computational models that move away from the conventional input-calculation-output scheme. Recently, I’ve been working on searching and patrolling games, specifically, ones that are inspired by biological systems.

**Paris must be an exciting place to live. How do you spend your time when you’re not working?**

Outside of work I’m busy with family life; my wife and I have two boys, aged 8 and almost 5. I have a passion for cooking, and Paris is a perfect place for this hobby with ready access to great ingredients, amazing cheeses and wines. And, thanks to a great children’s cookbook suggestion of Bernhard von Stengel during my visit to the LSE, my boys are joining me in the kitchen.