Dear students, please use the videos below if you missed a class. Please note, however, that these are, in essence, my “preparation notes” and are not designed to be a complete substitute for the tutorials. I’m using pencasts as motivation to work through the problems (a credible threat of a sort).
Solutions by Jonathon Livermore
Tutorial 1. Q1 Q2 (notes)
Tutorial 2. Q1 Q2 Q3 Q4
Tutorial 3. Q1 Q2 Q3 Q4 (notes)
Tutorial 4. Q1 Q2 Q3 Q4 (notes)
Tutorial 5. Q1 Q2 Q3 (notes)
Tutorial 6. Q1 Q2 Q3 Q4 (notes)
Tutorial 7. Q1 Q2 (notes)
Tutorial 8. Q1 Q2
Tutorial 9. Q1 Q2 Q3
Tutorial 10. Q1 Q2 Q3
Tutorial 11. Q1
Students’ reviews (1, 2, 3, 4)
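A known relationship that is usually given axiomatically:
$$P(A \mid B) = \frac{P(AB)}{P(B)}$$
Upon rearrangement it gives the multiplication rule of probability:
$$P(AB) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A)$$
Now observe a cool setup that is handy to keep in mind for proving the law of total probability and Bayes’ theorem.
Imagine that $B$ happens with one and only one of $n$ mutually exclusive events $A_1, A_2, \dots, A_n$, i.e.:
$$B = \sum_{i=1}^{n} B A_i$$
By the addition rule:
$$P(B) = \sum_{i=1}^{n} P(B A_i)$$
Now by the multiplication rule:
$$P(B) = \sum_{i=1}^{n} P(A_i)\,P(B \mid A_i)$$
This is the law of total probability.
From the same setup imagine that we want to find the probability of the event $A_i$ if $B$ is known to have happened. By the multiplication rule:
$$P(A_i B) = P(B)\,P(A_i \mid B) = P(A_i)\,P(B \mid A_i)$$
By neglecting $P(A_i B)$ and dividing the rest through by $P(B)$ we get:
$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{P(B)}$$
And applying the law of total probability to the denominator we have Bayes’ equation:
$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)}$$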
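To make the two formulas concrete, here is a minimal Python sketch. The three hypotheses and their probabilities are made-up numbers chosen only for illustration; exact fractions are used so that the arithmetic can be checked by hand.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(B) = sum_i P(A_i) * P(B | A_i) for mutually exclusive, exhaustive A_i."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def bayes_posteriors(priors, likelihoods):
    """P(A_i | B) = P(A_i) * P(B | A_i) / P(B) for every i."""
    p_b = total_probability(priors, likelihoods)
    return [p * l / p_b for p, l in zip(priors, likelihoods)]

# Hypothetical inputs: P(A_1), P(A_2), P(A_3) and P(B | A_i).
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]

p_b = total_probability(priors, likelihoods)
posteriors = bayes_posteriors(priors, likelihoods)
print("P(B) =", p_b)
print("P(A_i | B) =", posteriors)
```

Note how the least likely hypothesis a priori becomes the most likely one once $B$ is observed, simply because $B$ is much more probable under it.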
Bunch of examples:
Problem: $P_t(k)$ is the known probability of receiving $k$ phone calls during a time interval $t$. Also . Assuming that the numbers of calls received during two adjacent time periods are independent, find the probability $P_{2t}(s)$ of receiving $s$ calls during a time interval equal to $2t$.
Solution: Let $A_a^b(k)$ be the event consisting of $k$ calls in the interval from $a$ till $b$. Then clearly
$$A_0^{2t}(s) = A_0^{t}(0)\,A_t^{2t}(s) + A_0^{t}(1)\,A_t^{2t}(s-1) + \dots + A_0^{t}(s)\,A_t^{2t}(0),$$
which means that the event $A_0^{2t}(s)$ can be seen as a sum of $s+1$ mutually exclusive events, such that in the first interval of duration $t$ the number of calls received is $i$ and in the second interval of the same duration the number of received calls is $s-i$ ($i = 0, 1, \dots, s$). By the rule of addition
$$P\left(A_0^{2t}(s)\right) = \sum_{i=0}^{s} P\left(A_0^{t}(i)\,A_t^{2t}(s-i)\right).$$
By the rule of multiplication
$$P\left(A_0^{t}(i)\,A_t^{2t}(s-i)\right) = P\left(A_0^{t}(i)\right) P\left(A_t^{2t}(s-i)\right).$$
If we change the notation so that $P_{2t}(s) = P\left(A_0^{2t}(s)\right)$, then
$$P_{2t}(s) = \sum_{i=0}^{s} P_t(i)\,P_t(s-i).$$
It is known that under quite general conditions
$$P_t(k) = \frac{(at)^k}{k!}\,e^{-at}, \quad k = 0, 1, 2, \dots$$
(Recall that the Poisson distribution is an appropriate model if the following assumptions are true. (a) $k$ is the number of times an event occurs in an interval and $k$ can take values $0, 1, 2, \dots$ (b) The occurrence of one event does not affect the probability that a second event will occur. That is, events occur independently. (c) The rate at which events occur is constant. The rate cannot be higher in some intervals and lower in other intervals (that is kinda a lot to take on faith, really). (d) Two events cannot occur at exactly the same instant; instead, in each very small sub-interval exactly one event either occurs or does not occur. (e) The probability of an event in a small sub-interval is proportional to the length of the sub-interval. Or, instead of those assumptions, the actual probability distribution is given by a binomial distribution and the number of trials is sufficiently bigger than the number of successes one is asking about (the binomial distribution approaches the Poisson).)
Parametrisation then gives
$$P_{2t}(s) = \sum_{i=0}^{s} \frac{(at)^i}{i!}\,e^{-at}\,\frac{(at)^{s-i}}{(s-i)!}\,e^{-at} = \frac{(at)^s e^{-2at}}{s!} \sum_{i=0}^{s} \frac{s!}{i!\,(s-i)!} = \frac{(2at)^s}{s!}\,e^{-2at}.$$
The key point is that if for the time interval $t$ we have the parametrized formula, then for $2t$ we have the one above: the same formula with $t$ replaced by $2t$. It holds true for any multiple of $t$ as well.
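A quick numerical sanity check of the last step, as a sketch: assuming the Poisson form for a single interval, the convolution over two adjacent intervals should coincide with the Poisson formula for the doubled interval. The rate and interval length below are arbitrary illustrative values.

```python
import math

def poisson_pmf(k, rate, t):
    """Probability of exactly k calls in an interval of length t (Poisson form)."""
    mean = rate * t
    return mean ** k / math.factorial(k) * math.exp(-mean)

def calls_in_double_interval(s, rate, t):
    """Addition plus multiplication rule: sum over i calls in the first interval
    and s - i calls in the second, independent, interval."""
    return sum(poisson_pmf(i, rate, t) * poisson_pmf(s - i, rate, t)
               for i in range(s + 1))

rate, t = 1.5, 2.0  # hypothetical call rate and interval length
for s in range(6):
    by_convolution = calls_in_double_interval(s, rate, t)
    by_formula = poisson_pmf(s, rate, 2 * t)
    print(s, by_convolution, by_formula)
```

The two columns agree up to floating-point rounding, which is exactly the statement that the formula "survives" doubling the interval.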
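Out of $n$ elementary events one can get
$$\sum_{k=0}^{n} C_n^k = 2^n$$
possible outcomes, where $C_n^k$ is the number of events that contain $k$ elementary events. Take a set
$$A = \{a_1, a_2, \dots, a_n\}$$
with the size as the only characteristic, $|A| = n$. Then its power set
$$2^A = \{\emptyset, \{a_1\}, \dots, \{a_1, a_2\}, \dots, A\}$$
contains $2^n$ elements: $C_n^1 = n$ events with one element each, $\{a_i\}$; then $C_n^2$ events with two elements, $\{a_i, a_j\}$; finally, $C_n^n = 1$ event with all $n$ elements, $A$ itself. The empty set is the impossible event.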
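The counting can be checked by brute force. The sketch below enumerates every event (subset) of a small, made-up set of four elementary outcomes and groups the events by size; the set itself is just an illustration.

```python
from itertools import combinations
from math import comb

def events_by_size(elementary):
    """Map each size k to the list of all events containing k elementary outcomes."""
    items = list(elementary)
    return {k: list(combinations(items, k)) for k in range(len(items) + 1)}

outcomes = ["a", "b", "c", "d"]  # hypothetical elementary events
grouped = events_by_size(outcomes)
counts = {k: len(events) for k, events in grouped.items()}
total = sum(counts.values())

print(counts)                   # number of events of each size
print(total, 2 ** len(outcomes))
```

Each count equals the binomial coefficient for that size, and the counts add up to the size of the power set.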
I personally think that this simple fact is amazing, but some would say it is kinda boring. Here is an interesting question for those.
A pack of cards that has cards is randomly split into two equal halves. What is the probability that the halves have equal numbers of black and red cards?
This is just another set with elements of two types.
The denominator indicates all possible equally likely ways the pack can be split.
Instead of computing that manually one can use this asymptotic equality
Simple algebra yields
The result fascinates me. The graph visualizes data from a real experiment where a pack is split equally times and is a cumulated sum if exactly red cards are observed in one of the halves. What is crazy is that we were able to see the results of this experiment without doing any experiments, by simply reasoning mathematically about things.
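For readers who want numbers, here is a small sketch comparing the exact count with the asymptotic shortcut. The deck sizes (36 and 52 cards, half red and half black) are my assumptions for illustration, and the closed form $2/\sqrt{\pi N}$, with $N$ the number of cards in one half, is what Stirling's formula gives for the exact ratio under that assumption.

```python
from math import comb, pi, sqrt

def exact_probability(deck_size):
    """Exact chance that a random half of the deck holds equally many red and
    black cards. Assumes half the deck is red and deck_size is divisible by 4."""
    half = deck_size // 2  # cards per half, which is also the number of red cards
    favourable = comb(half, half // 2) ** 2   # choose reds, then blacks
    all_splits = comb(deck_size, half)        # every equally likely half
    return favourable / all_splits

def stirling_probability(deck_size):
    """Asymptotic value 2 / sqrt(pi * N), where N is the size of one half."""
    return 2 / sqrt(pi * deck_size / 2)

for deck_size in (36, 52):  # hypothetical deck sizes
    print(deck_size, exact_probability(deck_size), stirling_probability(deck_size))
```

The approximation is already within about one percentage point of the exact answer for decks of this size.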
More on this topic: Gnedenko (1988)
Imagine a pile of rubble ($X$) where the separate elements of the pile are stones ($x_i$). By picking $n$ stones we form a sample $x_1, x_2, \dots, x_n$ that we can sort by weight. The sequence $x_1, x_2, \dots, x_n$ becomes $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$, where the index $(m)$ is called the “rank”.
Pretend that we do the following. Upon picking a sample and sorting it, we put the stones into drawers and mark each drawer by rank. Now repeat the procedure again and again (picking a sample, sorting it and putting the stones into drawers). After several repetitions, we find that drawer #1 contains the lightest stones, whereas drawer #$n$ contains the heaviest. An interesting observation is that by repeating the procedure indefinitely we would be able to put the whole parent set (the whole pile, or the whole range of the parent distribution) into drawers and later do the opposite: take all the stones (from all the drawers) and mix them to get back the parent set. (The fact that the distributions (and moments) of stones of a particular rank and the parent distribution are related is probably the most thought-provoking.)
Now let us consider the drawers. Obviously, the weights of the stones in a given drawer (in a rank) are not the same. Furthermore, they are random and governed by some distribution. In other words, they are, in turn, a random variable, called an order statistic. Let us label this random variable $X_{(m)}$, where $m$ is the rank. Thus a sorted sample looks like this:
$$X_{(1)}, X_{(2)}, \dots, X_{(n)}$$
Its elements $X_{(m)}$ (the stones from the general set, the pile, that share rank $m$, i.e. one drawer) are called order statistics.
Elements $X_{(1)}$ and $X_{(n)}$ are called “extreme”. If $n$ is odd, the value with number $m = (n+1)/2$ is central. If is of order this statistic is called “ central”. A curious question is how to define “extreme” elements if $n \to \infty$. If $n$ increases, then $m$ increases as well.
Let us derive the density function of the order statistic $X_{(m)}$ for a sample of size $n$. Assume that the parent distribution $F(x)$ and density $f(x)$ are continuous everywhere. We’ll be dealing with a random variable $X_{(m)}$ which shares the same range as the parent distribution (if a stone comes from the pile it won’t be bigger than the biggest stone in that pile).
The figure has $F(x)$ and $f(x)$ and the function of interest $\varphi_n(x_{(m)})$. The index $n$ indicates the size of the sample. The $x$ axis has the values $x_{(1)}, \dots, x_{(n)}$ that belong to a particular realization of $X_{(1)}, \dots, X_{(n)}$.
The probability that the $m$-th order statistic $X_{(m)}$ is in the neighbourhood of $x_{(m)}$ is by definition (recall the identity $dF(x) = f(x)\,dx$):
$$dF_n(x_{(m)}) = \varphi_n(x_{(m)})\,dx_{(m)}$$
We can express this probability in terms of the parent distribution $F(x)$, thus relating $\varphi_n(x_{(m)})$ and $F(x)$.
(This bit was a little tricky for me; read it twice with a nap in between.) Consider that the realization $x_1, \dots, x_n$ is a series of trials (a sequence generated by the parent distribution, rather than by the order statistics; remember that the range is common) where “success” is when a value $X < x_{(m)}$ is observed, and “failure” is when $X > x_{(m)}$ (if still necessary, return to the pile and stone metaphor). Obviously, the probability of success is $F(x_{(m)})$, and of failure $1 - F(x_{(m)})$. The number of successes is equal to $m - 1$ and the number of failures to $n - m$, because the value $x_{(m)}$ in a sample of size $n$ is such that $m - 1$ values are less and $n - m$ values are higher than it.
Clearly, the process of counting successes has a binomial distribution. (Recall that the probability of getting exactly $k$ successes in $n$ trials is given by the pmf $C_n^k\,p^k (1-p)^{n-k}$. In words, $k$ successes occur with probability $p^k$ and $n - k$ failures occur with probability $(1-p)^{n-k}$. However, the $k$ successes can occur anywhere among the $n$ trials, and there are $C_n^k$ different ways of distributing $k$ successes in a sequence of $n$ trials. A little more about it)
The probability for the parent distribution to take a value close to $x_{(m)}$ is the element $dF(x_{(m)}) = f(x_{(m)})\,dx$.
The probability for the sample to be positioned around $x_{(m)}$ in such a way that $m - 1$ elements are to the left of it and $n - m$ to the right, and for the random variable $X$ to be in its neighbourhood, is equal to:
$$n\,C_{n-1}^{m-1}\,[F(x_{(m)})]^{m-1}\,[1 - F(x_{(m)})]^{n-m}\,f(x_{(m)})\,dx$$
(the factor $n$ counts which of the $n$ sample elements lands in the neighbourhood; the remaining $n - 1$ elements are the binomial trials).
Note that this is exactly $dF_n(x_{(m)})$, thus:
$$\varphi_n(x_{(m)})\,dx_{(m)} = n\,C_{n-1}^{m-1}\,[F(x_{(m)})]^{m-1}\,[1 - F(x_{(m)})]^{n-m}\,f(x_{(m)})\,dx$$
Furthermore, if on switching from $f(x)$ to $\varphi_n(x_{(m)})$ we maintain the scale of the $x$ axis, then
$$\varphi_n(x_{(m)}) = n\,C_{n-1}^{m-1}\,[F(x_{(m)})]^{m-1}\,[1 - F(x_{(m)})]^{n-m}\,f(x_{(m)})$$
The expression shows that the density of the order statistics depends on the parent distribution, the rank and the sample size. Note the distribution of the extreme values, when $m = 1$ and $m = n$.
The maximal (rightmost) element has the distribution $F_n(x_{(n)}) = [F(x_{(n)})]^n$ and the minimal one $F_n(x_{(1)}) = 1 - [1 - F(x_{(1)})]^n$. As an example observe the order statistics for ranks $m = 1, 2, 3$ with the sample size $n = 3$ for the uniform distribution on the interval $[0, 1]$. Applying the last formula with $f(x) = 1$ (and thus $F(x) = x$) we get the density of the smallest element
$$\varphi_3(x_{(1)}) = 3(1 - x)^2,$$
the middle element
$$\varphi_3(x_{(2)}) = 6x(1 - x),$$
and the maximal
$$\varphi_3(x_{(3)}) = 3x^2.$$
In full accordance with intuition, the density of the middle value is symmetric, like the parent distribution, whereas the densities of the extreme values are bounded by the range of the parent distribution and increase towards the corresponding bound.
Note another interesting property of order statistics. By summing the densities and dividing the result by their number:
$$\frac{1}{3}\left[3(1 - x)^2 + 6x(1 - x) + 3x^2\right] = 1 = f(x)$$
on the interval $[0, 1]$.
The normalized sum of the densities of the order statistics turned out to equal the parent density $f(x)$. It means that the parent distribution is a combination of the order statistics $X_{(m)}$. Just as mentioned above, after sorting the general set by ranks we could mix the drawers back together to get the general set.
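The "mixing the drawers back" property is easy to check numerically. The sketch below assumes a uniform parent distribution on $[0, 1]$ (so $F(x) = x$ and $f(x) = 1$) and uses the standard density of the $m$-th order statistic; the function names are mine.

```python
from math import comb, isclose

def order_stat_density_uniform(x, m, n):
    """Density of the m-th smallest of n independent U[0, 1] draws (m is 1-based)."""
    return n * comb(n - 1, m - 1) * x ** (m - 1) * (1 - x) ** (n - m)

def average_density(x, n):
    """Sum the densities of all n ranks and divide by their number."""
    return sum(order_stat_density_uniform(x, m, n) for m in range(1, n + 1)) / n

n = 3
for x in (0.1, 0.5, 0.9):
    densities = [order_stat_density_uniform(x, m, n) for m in range(1, n + 1)]
    print(x, densities, average_density(x, n))
```

Whatever the point $x$ and the sample size $n$, the average comes back to 1, the density of the uniform parent.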
It must be a good restaurant since the line is so long. Hm… you have likely just failed to update your beliefs in a rational way.
Imagine you are in a classroom and there is an urn with three balls in front of everyone. You don’t see the colour of the balls, but you do know that the urn is equally likely to be majority blue (2 blue, 1 red) or majority red (1 blue, 2 red). Since you don’t know which urn exactly is there (the true state of the world), you need some evidence before making a guess. Now every person in the class, one by one, comes up, picks one ball from the urn and, without showing it, announces their guess about the urn. Believe it or not, but this is your restaurant choice situation.
The two possibilities for the urn are an analogue of whether the restaurant is good or bad. A person that comes to make a choice has several pieces of information to combine. Taking one ball from the urn is the same as having read some review of the restaurant beforehand. The information is not perfect: the reviews could be biased or not representative of your taste. However, you have also observed the choices of the people before you. You do not know their private signals (what ball they picked from the urn, i.e. what their conclusion was after studying the restaurant reviews), but you do know their choices.
Claiming that the restaurant must be good because the line is long would be true only if all the people that come sequentially followed only their private signals. Then, when your time has come to make a choice, the line indicates independent draws of balls from the urn. If the true state of the world were that the urn is majority blue, you would have many more people saying so.
The thing is that those choices are clearly not independent. At some point, a person whose private signal says that the urn is majority blue might see too many people choosing majority red, and will abandon the private signal and follow the crowd. So when it is your turn to make a choice and you observe a line (i.e. heaps of people announcing their choice), it does not necessarily mean that the restaurant is good. Put differently, you fail to account for the correlation between public beliefs (a belief based on the observed choices, before seeing your private signal) and private signals.
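Here is a minimal sketch of the urn story. It rests on my own simplifying assumptions: each ball is returned to the urn after being drawn (so a signal matches the majority colour with probability 2/3), everyone is a Bayesian, and a person who is exactly indifferent follows the private signal. Under these assumptions, once the inferred signals lean two-to-nothing towards one colour, a single opposite private signal can no longer flip the decision.

```python
import random
from fractions import Fraction

ACCURACY = Fraction(2, 3)  # chance that a drawn ball matches the majority colour

def posterior_majority_red(red_signals, blue_signals, prior=Fraction(1, 2)):
    """Bayes update on 'the urn is majority red' after independent draws."""
    like_red = ACCURACY ** red_signals * (1 - ACCURACY) ** blue_signals
    like_blue = (1 - ACCURACY) ** red_signals * ACCURACY ** blue_signals
    return prior * like_red / (prior * like_red + (1 - prior) * like_blue)

def run_class(true_state, n_people, rng):
    """Sequential announcements; ties are broken in favour of the private signal."""
    lead = 0  # red minus blue signals that can be inferred from announcements
    announcements = []
    for _ in range(n_people):
        p_red_ball = 2 / 3 if true_state == "red" else 1 / 3
        signal = "red" if rng.random() < p_red_ball else "blue"
        if lead >= 2:        # crowd outweighs any single signal: herd on red
            choice = "red"
        elif lead <= -2:     # herd on blue
            choice = "blue"
        else:                # the announcement still reveals the private signal
            choice = signal
            lead += 1 if signal == "red" else -1
        announcements.append(choice)
    return announcements

# Two inferred red signals against one private blue signal still favour red.
print(posterior_majority_red(2, 1))
print(run_class("blue", 15, random.Random(1)))
```

If the first two people happen to announce the same colour, everyone after them repeats it, whatever the true urn is. A long line then carries the information of two draws, not of the whole line.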
Well, that is herding. And here is a presentation about it….
It is obviously not about restaurants at all; it could be the choice of a major for a college degree. Is being a doctor a good choice or not? There is no way to know for sure; you just have to combine your private signal with the public belief. If you don’t have a strong private belief, then it will be overwhelmed by the public belief and you will just follow the crowd. It could also explain why in Russia or Germany, during the good times, all people would put Nazi flags outside or put Stalin’s portrait on the wall at home and in the office. Or pretty much anything else that involves guessing the state of the world by combining information from your own guess and the choices of others.
Always start from the histogram; all non-parametric density estimation methods are essentially fancier versions of a histogram.
Compare the problem of choosing an optimal bin size in a histogram with the choice of the bandwidth h in a kernel estimator.
The point of the exercise is to reveal all the features of the data, and that is what is important to keep in mind.
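To see the parallel between bin width and bandwidth, here is a bare-bones sketch of both estimators in plain Python. The Gaussian kernel and the simulated standard normal sample are my choices for illustration, not something prescribed above.

```python
import math
import random

def histogram_density(data, x, bin_width, origin=0.0):
    """Histogram estimate at x: share of points in x's bin, divided by bin width."""
    index = math.floor((x - origin) / bin_width)
    left = origin + index * bin_width
    right = left + bin_width
    count = sum(1 for v in data if left <= v < right)
    return count / (len(data) * bin_width)

def kernel_density(data, x, h):
    """Gaussian kernel estimate at x with bandwidth h."""
    scale = len(data) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in data) / scale

rng = random.Random(42)
sample = [rng.gauss(0, 1) for _ in range(500)]  # simulated data

# The same smoothing parameter plays the role of bin width and of bandwidth h.
for width in (0.1, 0.5, 2.0):
    print(width,
          round(histogram_density(sample, 0.25, width), 3),
          round(kernel_density(sample, 0.25, width), 3))
```

A small width gives a noisy estimate, a large one flattens every feature of the data; the kernel version just replaces the hard bin edges with smooth weights.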
And now take a look at a perfect application of the idea in
Nissanov, Zoya, and Maria Grazia Pittau. “Measuring changes in the Russian middle class between 1992 and 2008: a nonparametric distributional analysis.” Empirical Economics 50.2 (2016): 503-530.
Going back to the advice: keep in mind that you are doing it to reveal features of the data, and it has to be strictly more informative than a histogram; otherwise the computational costs are not justified.
Check my presentation on an empirical model of firm entry with endogenous product-type choices. (here)
A normal reaction to the presentation’s topic should be “whaat? why would anyone want to do this stuff for a living?”. It is a great question, I don’t have an answer to it. It is indeed viciously technical and deadly boring.
But I do have something really cool to share. Back home I was driving my 15-year-old niece to a museum and failed to find a humanly understandable combination of words to explain what science is. So now you check this combination of words, I think it is a really cool fit….
A human eye is able to capture quite a limited portion of the light spectrum (the visible spectrum). We are unable to travel in time or reach most of the planets in the galaxy. Yet there is no need to be able to physically see the whole light spectrum to actually “see” it. And you do not need to be able to travel in time to “see” the past, just like you do not need to be able to travel to another planet to “see” that planet. Here is a cool angle on it: an information integration theory of consciousness, an exceptionally creative idea that, if appreciated properly, will blow your mind.
Human bodies have an enormous number of systems, like no other living being. We feel temperature and objects, we see and hear, we feel emotions like fear, shame, happiness, etc. Our brain integrates all of this information from all the systems into a sense of reality. Put differently, reality as seen by a person is but an aggregate of sensations from a set of systems that continuously register information. Think about the feeling of pain. Pain is your body’s language. If your body needs attention from you, it sends a signal. However, the signal has only one dimension; it is kind of like a baby’s cry. A baby can only change the intensity of the cry, and it is your job to give this cry an interpretation. Your brain does the same. (To be more precise, you do it yourself but unconsciously; it is one of those automatic processes, kinda like intuition.) Consciousness, or the capacity to separate yourself from other things, is just another trick of your brain. Instead of giving you raw information from the systems that systematically aggregate it, the brain gives you an interpretation. Instead of overwhelming you with tonnes of sensations, the brain gives you their meaning. Reality is the brain’s interpretation of the aggregated information from a number of systems that supply raw data.
Holy bologna!! But is that not what science is? Yes, indeed. Science is nothing but a natural extension of a process that your body does almost automatically: aggregating information from systems that continuously register information and assigning meaning to it. (There is also this thesis that mathematics is nothing but common sense, quite a dense one at times. I’ll see if I can make this post compact and readable enough; if I do, I’ll give you that idea as well.)
It is also interesting to look at people’s temperaments. The system integrator (our brain, our consciousness) assigns different weights to the different systems from which it gets information. That’s why we sometimes observe people who are always scared or calm, sympathetic or cold. Of course, there are other things that define character, or a predilection for specific kinds of decisions, such as upbringing and genetics, yet the system integrator has the last word.
Ok. Your brain has the capacity to integrate information from systems that systematically aggregate information and to assign meaning; one product of this process is consciousness, or a sense of reality. But the systems do not have to be physiological; they do not necessarily have to be attached to your brain through a common nervous system. A system just has to be something that contains information. Let’s go back to the very beginning of this post. Yes, indeed, people see quite a narrow part of the light spectrum; however, there are devices which can capture those waves. Cameras, for example, continuously aggregate information. It would never have been done if we had limited ourselves to physiological systems. However, for your brain, information captured by the camera has the same value as information captured by your eyes. The only difference is that your brain has to readjust itself to be able to aggregate information from it. That is why, in the beginning, when you look at some figure that contains information you are confused, but with time you realign the integration process. In other words, you have to be able to incorporate this new information and combine it with information from other systems. When you do mathematics, it is very important at some point to stop and think about the meaning of the equations that you have. You have to integrate this information with the other information that your brain has and assign meaning to it. That is, in fact, a process of co-integration of information from different sources. And it is very costly for your brain to do, which is why it is so annoying. Another example from the beginning is our incapacity to travel across time. Well, the physical world, unfortunately, has this dimension which only goes one way, and the speed of this going cannot normally be changed. But all of us have some videotapes from the past.
Imagine that there is a probe that is able to capture some information from the past and keep it (pictures, videotapes, documentary movies). Such systems allow us to travel through time, and for our brain this is identical to travelling into the past ourselves. You just have to put in some effort to integrate the information from the new systems. People who study history or work on documentary movies immerse themselves in systems that continuously register information from the past, and their brains are trained well enough to easily incorporate this knowledge and assign a meaning to it. Another example: to get information about faraway planets one does not have to physically travel there. Astronomical spectroscopy allows us to systematically capture information about the planets, and then you can realign this knowledge so that your brain incorporates and integrates it into a perception of reality, just like it would do with information from your eyes. And the final example is statistical work. If you have some data sets, you can do some statistics to draw some conclusions. But most often, to do statistical work a person has to merge two data sets. Those two different data sets are nothing but systems that continuously capture information about some object. Put differently, there are two independent systems that continuously register information about some object (it is other people that put down a number; in theory, instead of a number they could have used words, but then we are back to the crying baby case: the signal is not rich enough). They look at the same place, and what people can do is combine this knowledge to assign some meaning to it.
The point is that our brain is capable of aggregating information from many more systems than its physiological limits dictate.
In some sense, our brain is a prisoner of our physiological systems. So one way to put it is that science sets your brain free. Seeing and thinking are the same thing when your eyes are closed. Put differently, the things that we physically see or feel here are just a little fraction of what we potentially can see if we allow our brain to aggregate information, and assign meaning to it, from a much wider set of systems that continuously register information. The sense of reality, consciousness, is a computational shortcut, because otherwise your brain would be overwhelmed with information.
In fact, any meaning is a computational shortcut that only your brain requires. Objective reality exists as an enormous, mostly meaningless set of data. Life exists only because it can; asking for the meaning of life is the most idiotic question of all. Meaning itself is senseless; it is nothing but a trick of your brain to aggregate information more easily (It sounds really weird… hm… I probably should wrap up with this one, better do another post).
P.S. To survive, people developed a capacity to form groups very quickly (morality) and to make decisions under uncertainty very quickly. A sense of reality, or consciousness, is sort of a “sufficient statistic”. For the decision at hand (to survive) we can form one parameter, a meaning, that contains all the useful information from the data that surround us. It economizes on computational requirements and minimizes the risk of a mistake (sometimes the cost of a mistake is your life).
To overcome their physical vulnerability, ants developed a unique way to navigate in uncertainty. The chemical trace allows an ant and all its buddies to find the way from food to home (it is very close to how people use market prices to send information; see the pencil example by Friedman). Spiders have developed the web to catch insects the same size as the spiders themselves (morality is an evolved part of human nature, much like a tendency to weave webs is an evolved part of spiders’ nature. See figures with “gossiping” here). What’s so special about humans? There is no better way to demonstrate it than with the movie Allied. What do you choose: allegiance to your family or to your country? The choice evokes a range of thoughts, feelings, emotions, and intuitions about what to do, what is the right thing to do, what one ought to do, what is the moral thing to do. Nobody except humans possesses morality, but why, over millions of years of evolution, did nature develop such a peculiar attribute? Morality is what makes people come together and play non-zero-sum games; it was an evolutionarily necessitated device that ensured survival. The feeling of “right” and “wrong”, “good” and “bad” is nothing but your brain figuring out how to act in groups and use groups to its advantage. (Next time you go to the park and see many groups of people, keep in mind that this is happening because the act of cooperating is rewarded with oxytocin. The brain uses hormones like a carrot and stick to incentivise a particular form of behaviour, the one that proved to increase the chances of survival.)
What are these moral thoughts and feelings, where do they come from, how do they work, and what are they for? There is a scientific answer to these questions. It is possible to use the mathematical theory of cooperation—the theory of nonzero-sum games—to transform this commonplace observation into a precise and comprehensive theory, capable of making specific testable predictions about the nature of morality. (Curry 2016)
A little experiment called the Public Good Game (aka the n-player prisoner’s dilemma. Imagine you have a baby, and you and your partner each have something very important to do for yourselves, so that each would like the other one to sit with the baby. But if both bail on sitting with the baby… then you both suffer, because the little one might fall, choke or something. It is individually rational to defect from providing the public good and “free-ride”. If there are many players, it becomes a Public Good Game) captured a feature that is unique in the animal world: “reciprocal altruism”. People trust strangers if they see that the strangers are eager to cooperate. Only humans possess this.
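The incentive structure can be written down in a few lines. This is a sketch of the textbook linear version of the game; the endowment of 10, the multiplier of 2 and the four players are made-up numbers.

```python
def public_goods_payoffs(contributions, endowment=10, multiplier=2):
    """Each player keeps what was not contributed and gets an equal share of the
    multiplied common pot (linear public goods game)."""
    pot = multiplier * sum(contributions)
    share = pot / len(contributions)
    return [endowment - c + share for c in contributions]

everyone_cooperates = public_goods_payoffs([10, 10, 10, 10])
one_free_rider = public_goods_payoffs([0, 10, 10, 10])
nobody_cooperates = public_goods_payoffs([0, 0, 0, 0])

print(everyone_cooperates)
print(one_free_rider)
print(nobody_cooperates)
```

The free rider earns more than the cooperators, and more than under full cooperation, so defecting is individually rational. Yet if everybody reasons this way, each player ends up with half of what universal cooperation would have paid.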
This feature manifests itself in technologies of trust (exchange and reciprocity) such as money, written contracts, ‘mechanical cheater detectors’ such as ‘[c]ash register tapes, punch clocks, train tickets, receipts, accounting ledgers’, handcuffs, prisons, electric chairs, CCTV, branding of criminals, and criminal records. And this very feature allows humans to create social structures such as markets, political elections and …states. People had money, laws and elections way before political science and economics had anything to say about them. All these social structures, markets, elections and states themselves, allow strangers, that is, genetically unrelated individuals, to coexist beneficially.
Ok, but what does this have to do with the key difference between developed and underdeveloped countries? Well, everything.
The developing countries are simply unable to form social structures effectively. They cannot fairly elect political leaders; they cannot maintain a market economy without the terrible abuses that potentially come with market economies. If people generally do not follow laws, a country practically does not have any laws. Financial technologies are a pure manifestation of “reciprocal altruism”, where the complexity and richness of financial instruments are based on nothing but a piece of paper that has power only if people trust it. The problem of developing countries is that people in these countries are unable to cooperate effectively. They are unable to play a non-zero-sum game. In the US strangers came together and created the iPhone; in Russia, people fail to organise themselves into homeowner associations (another interesting example is how Russians treat the national currency: everybody ditches it whenever the opportunity arises, which leads to volatility and a self-fulfilling prophecy that the currency had to be ditched). In general, the breakdown of cooperation in such games as the Public Good Game or the Minimum Effort Game is called coordination failure.
What is curious is that playing non-zero-sum games is a natural, evolutionarily developed tendency in any human. In the absence of interference, people will eventually form effective cooperation. They will come up with the sets of rules and beliefs that allow for an effective non-zero-sum game. My favourite example is a lovely place called Russia, where the government does practically everything possible to break down effective cooperation by systematically taking actions that induce negative beliefs.
Hm… I started the post with morality. Morality is what makes you feel like punishing defectors in the Public Good Game (you say “this is wrong”), makes you contribute if everyone else contributes (you say “I feel bad about not doing the right thing”), and makes you feel offended if you contributed but most did not (you say “I feel like an idiot for doing this”). All people say these things in their heads, and that is what makes them come together and do great things. Or, if you live in some underdeveloped country, never do anything great.
P.S. Check this awesome quotation from here:
Cooperation depends on trust, which in turn requires evaluating individuals and groups as potential cooperation partners. Oxytocin, a neuropeptide known for its role in social attachment and affiliation in mammals appears to be important for both kinds of decisions. Intranasal administration of oxytocin increases investment in a “trust game”, but also biases judgment and behavior toward ingroup members and against outgroup members. Likewise, genetic variants associated with oxytocin are associated with increased prosocial behavior, particularly when the world is seen as threatening. From an evolutionary perspective, the double-edged sword of human morality comes as no surprise. Morality evolved, not as device for universal cooperation, but as a competitive weapon, as a system for turning Me into Us, which in turn enables Us to outcompete Them. Morality’s dark, tribalistic side is powerful, but there’s no reason why it must prevail. The flexible thinking enabled by our enlarged prefrontal cortices may enable us to retain the best of our moral impulses while transcending their inherent limitations.
A little report on a paper about collusion in the electricity market in the UK.
In the late 1990s, the combination of game theory and econometrics produced new techniques for collusion detection. The advantage of these techniques is that you just need readily available public data and a few simple equations that reasonably capture firms’ behaviour.
The big picture is that if you know the costs of the firm, you can already tell whether the prices are way too high.
The papers are essentially identical. The new technique is applied and then the results are compared with more conventional methods, e.g. ones using cross-market variation (which by definition require way more data). The bottom line is that the technique works. Hurray.
… and a little aside, as per usual. A market is only one case of a social structure where strangers interact; there are many others, e.g. elections and law enforcement. (These social structures are all trust-based technologies. Trusting a stranger, or a piece of paper, is a unique evolutionary feature that is observed exclusively in people. It allows us to play non-zero-sum games (cooperate, build states and stuff) and kick butts even if we are physically weaker than most predators in the animal world.) What’s nice about markets is that here things are sort of black and white: everyone knows what they are doing. Yet almost any concept that has been designed to capture interaction in a market can be generalised to any other social structure. A few examples. The idea of being small, so that you take the environment as given, as in, you can’t do anything about it. When you vote for a president, your single vote is indeed too small to influence the outcome; when you vote within the local community, though, your vote matters a lot and the environment is not exogenous at all. The idea of elasticity carries over directly to, for example, the relationship between men and women. If the market is inelastic then you can abuse it, just like you can abuse a woman that doesn’t have anywhere to go (cool kinda related paper). Yet if there are many more “men” among whom a “woman” can choose, then the market becomes very elastic and one cannot abuse it. Indeed, a lot of what happens in the market can be generalised to any other social structure.
The black-and-whiteness of the market comes from the fact that everybody is kinda aware of what game is being played. The problem with other social structures is that people don’t really know what game they are really playing. (Crooked politicians, for example, will do everything they can to make sure that people are clueless about what they are actually choosing among.) People’s minds have morality (another evolutionarily developed feature that allows people to form groups), which plays a very, very important role in social structures, i.e. the notions of right and wrong and their boundaries (more technically, they affect beliefs about whether your “high” effort will be supported by others and not taken advantage of). Think about a country where it is customary for men to have way more rights at the expense of the rights of women. It would be very typical to see that women, in fact, are happy to give those rights to men, because they truly believe that their place is in the kitchen or in a lower-paid job or something like that. Put differently, social norms very often prevent players from realising what game exactly is being played. Markets in this sense are way less “contaminated” by those social norms, yet they are still very much affected. General notions of right and wrong play important roles in markets, just like they do in any other social structure. (Check this experiment that says that economists are more “rational” (read: selfish).) The American winner-takes-all culture leads to very aggressive corner solutions by the corporate world; naturally, to offset those, the US has a very strong regulatory body.
Think of Russians, who before the 1990s didn’t have any market experience. When markets were introduced after the 1990s, the rules were taken quite literally. Of course, there was a lot of influence from the so-called market fundamentalists at the IMF, which reinforced the idea that since this is capitalism and these are markets, you can do everything which is not directly prohibited, and even if something is prohibited, it is within the rules of the game to break the rules if you can.