Friday, 29 August 2008

Telling stories and lies with statistics

An article in the 30/8/08 New Scientist, "Keep your head", discusses how our emotions override rational decision-making.

It is fine as far as it goes, and the example of the vast increase in deaths as a consequence of people driving long distances instead of flying, due to fear caused by the terrorist attacks, is a good one. But the chart that it uses to support the argument - and the general story-line of "when it comes to risks, feel the numbers" - raises other problems.

Here is a reproduction of the chart:

OK, so flying is much safer than driving. But, look, driving is much safer than walking or cycling? Is that right? Can I really say 'driving is much safer than walking or cycling' based on this graph? I don't think so, and let me give three problems with drawing simplistic conclusions like that:

1) They are not simple alternatives. You can't choose between flying or walking to the local shop. If you need to do some food shopping you might choose between walking ten minutes to the corner shop, cycling ten minutes to the 'mini-market' or driving ten minutes to the supermarket. The comparison then is between deaths per hour, not deaths per kilometre. You could make a similar choice for your holiday destination: drive, take the train or bus to Blackpool, or fly to Spain. Again, the meaningful comparison is per hour, not per km.

2) These only include deaths to the passenger during the travel. What about killing other people? Cars kill other people, including cyclists and pedestrians; bicycles rarely do, and pedestrians never do (not as a result of walking into people anyway - I presume!). And of course cycling and walking keep you fit. It is said that the health benefits of cycling far outweigh the increased risk from accidents, compared to driving. I don't have the figures to confirm that, but it is a consideration.

3) What about wider systems issues - damage to the environment (global warming), impact on the economy? These are really difficult to evaluate, but that doesn't mean they don't exist.

To look at (1), I did a rough-and-ready conversion of the numbers from the article to 'deaths per million hours' based on a guess of the average speed for each mode of transport. (Motorcycle, car and bus 70 km/hour, walking 4 km/hour, bike 15 km/hour, van 60 km/hour, water 50 km/hour, rail 100 km/hour and air 800 km/hour. I also needed a figure for deaths per billion km for air travel, because the New Scientist article just said 'less than 0.1'. A search on the web turned up a figure of 1 death per 15 billion passenger km, so I used that.) This gave the chart below (note that the column for motorcycle is shortened).
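For anyone who wants to repeat the conversion, here is a minimal sketch of the arithmetic in Python. The speeds are the guesses listed above; the function name and layout are my own for illustration. Only the air figure (1 death per 15 billion passenger km) is filled in, because that is the only per-km number quoted in this post. The others would have to be read off the New Scientist chart.

```python
# Guessed average speeds in km/h, as listed in the post.
SPEED_KMH = {
    "motorcycle": 70, "car": 70, "bus": 70, "walking": 4, "bike": 15,
    "van": 60, "water": 50, "rail": 100, "air": 800,
}


def deaths_per_million_hours(deaths_per_billion_km, speed_kmh):
    """Convert a per-distance death rate into a per-time death rate."""
    # In one hour a traveller covers speed_kmh kilometres, so
    # deaths per hour = (deaths per km) * (km per hour).
    deaths_per_km = deaths_per_billion_km / 1e9
    deaths_per_hour = deaths_per_km * speed_kmh
    return deaths_per_hour * 1e6


# 1 death per 15 billion passenger km, expressed per billion km.
air_per_billion_km = 1 / 15
air_rate = deaths_per_million_hours(air_per_billion_km, SPEED_KMH["air"])
print(f"air: {air_rate:.3f} deaths per million hours")  # air: 0.053 deaths per million hours
```

In other words, multiply the per-billion-km figure by the speed in km/hour and divide by 1000.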

I admit I was disappointed to find that cycling still came out worse than driving, but the difference is now much smaller, and surely more than compensated by the health benefits of cycling. What surprised me was that flying came out worse than water, bus or rail. So a train trip to Blackpool looks like your best bet for the hols! (And why would anyone ever go anywhere on a motorbike?)

Wednesday, 27 August 2008

Intropy, and Collier's paper.

I called this blog intropy because it was a reasonably short word with links to 'information' that was still available as a short url...

I'd come across the word in the paper by Collier, though at the time I'd only scanned the paper and didn't really understand what he meant by the word. I had another go at Collier's paper while on holiday last week and I still don't understand it... Well, not entirely, but here are some thoughts about it.

Collier says "The order [of a system] is sometimes called the "intropy" of the system (in contrast to its entropy)." Previously he has defined order as:

Order = S_max - S_act
where
S_max = k log P
and
S_act = -k ∑_i p(m_i) log p(m_i)
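
To get a feel for the definition, here is a small sketch in Python. I am assuming that P is the number of possible states and that p(mi) is the probability of state mi; the function name and the default k = 1 are my own choices for illustration, not Collier's.

```python
import math


def intropy(probs, k=1.0):
    """Order = Smax - Sact for a discrete probability distribution."""
    num_states = len(probs)  # P, assumed to be the number of possible states
    s_max = k * math.log(num_states)
    # Terms with p = 0 contribute nothing to the sum.
    s_act = -k * sum(p * math.log(p) for p in probs if p > 0)
    return s_max - s_act


uniform = intropy([0.25, 0.25, 0.25, 0.25])  # all states equally likely
certain = intropy([1.0, 0.0, 0.0, 0.0])      # one state is certain
print(uniform, certain)
```

With all four states equally likely the actual entropy equals the maximum, so the order is zero (up to rounding). With one state certain the actual entropy is zero and the order takes its largest value, k log 4.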

Intropy is not a widely used term, and some authors imply it is a synonym for negentropy, but in another paper (Hierarchical Dynamical Information Systems With a Focus on Biology, John Collier Entropy v5 pp 100-124, 26 June 2003) Collier says that intropy is one type of negentropy (enformation is another).

All these terms, like Gibbs Free Energy, are to do with the amount of order available to do something useful.

Information, causality and the EPR paradox

Some thoughts linked to my earlier post about the EPR paradox, arising from Dretske: Knowledge and the flow of information.

I was arguing earlier that the issue with faster-than-light travel is to do with the communication of information, i.e. that the 'thing' you can't allow to go faster than light is information. But implicit in my reason for that was a direct link between causality and information: for A to cause B it is necessary to convey information from A to B. But Dretske argues that there is no such link between causality and information and, furthermore, argues that there can even be information flow backwards in time. If this were so, then clearly my argument about information flow in EPR would be incorrect.

Dretske says that A can cause B without information flowing from A to B, and that information can flow from A to B without A causing B. His arguments for these conclusions are illustrated by the following diagram (from Dretske, 1991, page 28).


The arrows show causal connections: s2 causes r2 etc. Dretske argues that the causal link does not tell us anything about the information flow, because the fact that other events at s can also cause r2 means that the information r2 gives about s is reduced. So knowing that s2 causes r2 tells us nothing about the information about s learned from r2: causal link does not imply information flow.

Similarly, s4 can cause any one of r1, r3, r4. Now r1 tells us s4, so r1 gives good information about s, even though we can't say 's4 causes r1'. Hence information flow does not imply causal connection.
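
The two points can be put into numbers with a small sketch in Python. The channel below is hypothetical: I am assuming four equally likely source events, with s1, s2 and s3 all producing r2, and s4 producing r1, r3 or r4 with equal chance. That matches the description above but is not necessarily Dretske's exact diagram, and the function names are mine.

```python
import math

# Hypothetical channel (an assumption, not Dretske's figure): P(r | s).
CHANNEL = {
    "s1": {"r2": 1.0},
    "s2": {"r2": 1.0},
    "s3": {"r2": 1.0},
    "s4": {"r1": 1 / 3, "r3": 1 / 3, "r4": 1 / 3},
}
PRIOR = {s: 0.25 for s in CHANNEL}  # assumed: all source events equally likely


def posterior(r):
    """P(s | r) by Bayes' rule, keeping only the possible source events."""
    joint = {s: PRIOR[s] * CHANNEL[s].get(r, 0.0) for s in CHANNEL}
    total = sum(joint.values())
    return {s: p / total for s, p in joint.items() if p > 0}


def equivocation_bits(r):
    """Uncertainty about s (in bits) that remains after observing r."""
    return sum(p * math.log2(1 / p) for p in posterior(r).values())


print(posterior("r2"), equivocation_bits("r2"))
print(posterior("r1"), equivocation_bits("r1"))
```

Under these assumptions, observing r2 leaves three equally likely candidates for s (about 1.58 bits of the original 2 bits still unresolved), even though each of them is a perfectly good cause of r2. Observing r1 pins s down to s4 with nothing left unresolved, even though s4 does not reliably cause r1.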

As to information flowing backwards in time, see Figure 1.8 from Dretske (page 38).


There is no physical connection between B & C, yet knowledge of C can be gained from observation of B. Dretske argues that there is an informational link between B & C. (Figure 1.8 looks very like the EPR experiment.) Dretske comments:
Nothing at B causes anything at C or vice versa; yet C contains information about B and B about C. If C is further from the transmitter than B, the events occurring at C may occur later in time than those at B. [...] This sounds strange only if the receipt of information is confused with causality. For, of course, no physical signal can travel backwards in time carrying information from C to B.

Thursday, 14 August 2008

1919

I mentioned yesterday the London Review of Books article by Michael Wood about Yeats' "Nineteen Hundred and Nineteen".

A couple of thoughts arising from it.

First, as with the painting, I'm seeing the poem on a scale with a novel. What I mean by that is that in some sense there is a whole novel's worth of information in it. The whole poem can be represented by around 20,000 bits*, but what does that say about the information content of the poem? Surely the poem represents weeks, months, years of work by Yeats. I don't know how long he took to write it, but that is anyway only the tip of the iceberg. In some sense it contains input from all his experience and learning up to the date of composition. And all that input is matched by the output for the reader. That's why I needed the explanation, and why it bears reading and re-reading time after time.

* The text of the poem (with title, line spacing and stanza headings (I, II...)) has 4824 characters. Coded with 7-bit ASCII that would be 33,768 bits. As a text file it was 5141 bytes, which is 41,128 bits. Compressing the text file with winzip got it down to 2666 bytes, which is 21,328 bits.
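
The same three counts can be reproduced for any text with a few lines of Python. This is only a sketch: the function name is mine, and I am using zlib from the standard library as a stand-in for winzip, so the compressed size will not match winzip's exactly.

```python
import zlib


def bit_counts(text):
    """Return three rough measures of the size of a text, in bits."""
    raw = text.encode("utf-8")       # the text as it would be stored in a file
    packed = zlib.compress(raw, 9)   # DEFLATE compression at the highest level
    return {
        "seven_bit_ascii": len(text) * 7,
        "file": len(raw) * 8,
        "compressed": len(packed) * 8,
    }


# Placeholder text, not the poem itself.
sample = "the quick brown fox jumps over the lazy dog\n" * 50
counts = bit_counts(sample)
print(counts)

# The arithmetic in the footnote above checks out:
assert 4824 * 7 == 33768
assert 5141 * 8 == 41128
assert 2666 * 8 == 21328
```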

Second, something totally unrelated, but an observation on something Wood says:
The shock of this moment, this breaking of a vast illusion [in shorthand, this illusion is the 'liberal dream'- my words - but you need to read the whole article], is pictured as having violence at its heart, and indeed as possible only through violence. The picture doesn’t make this or any other violence acceptable or welcome, but it does mean that it’s hard to deny or even deplore the new truth, because the truth is always in one respect an improvement on fantasy – in one respect only, I hasten to add, since fantasy is an improvement on truth in every other way. That is what fantasy is for. Still, this respect is important. One can’t build on error, and with truth there is at least a chance. At the same time this new perception so completely wrecks the past that for the moment the wreck is all that can be seen. Violence is the name of this wreck; it is whatever brings to our minds that knowledge which we cannot and will not gain otherwise. It would be desperate and in a horrible way romantic to believe that there is no other way in which we can ever get the knowledge we need, and we should certainly try to do without uncontrollable mysteries if we can. But it is clear that once days are dragon-ridden, however we explain the arrival of the dragon, neither the past nor the future can be the same.
The bits I'm focussing on I've put in bold, but I'm quoting the whole paragraph because increasingly my theme is context, and extracting short quotes is the big crime against context.

Anyway, my point is that you have to be so, so careful when you start talking about 'truth' and 'error'. Even in science I have problems with, say, Newtonian physics being 'wrong' and relativity being 'right', but all the more in human affairs. Writing in T324 about critical reading, a colleague included in a list of things to be aware of "assumptions about the truth". I think that is so important, and is something we have to be alert to all the time. And assumptions are so often self-fulfilling. It is like one of the standard ideas of child-rearing: if you assume a child is naughty they will be. If you assume a child is responsible they will be (well, sometimes...).

Wednesday, 13 August 2008

Between poetry and painting: Houédard

A curious item at the British Library: "Between Poetry and Painting Chronology" by Dom Sylvester Houédard ("dsh").

Searching for "Information Theory" and "Art" on the BL catalogue, I landed on this document. It was 14 pages of typescript on yellow/orange sheets of paper (roughly 2/3 A4 size - I don't know what that is called, if anything).

I photocopied the first two pages and have put scanned versions below. (I don't have copyright permission, but I don't think anyone will object - see below, about ubuweb and their faq#4.)





I didn't realise it at the time, but further investigation suggests it is associated with the exhibition catalogue for the Institute of Contemporary Arts in London, 1965.

As ever, making my own connections, serendipity makes this link between poetry and painting work for me. In the current London Review of Books there is an article about W. B. Yeats' Nineteen Hundred and Nineteen. I was struck by the way in which I needed the article to make the poem 'work' for me - it was entirely the same experience as I had with the painting of the Bellelli Family by Degas, which I discussed a while back. In both cases, though I needed the explanations, the explanations do not replace the poem/painting. The poem/painting retains a depth beyond the explanation.

Searching the web for information on Houédard, I found him on ubuweb. (He does not, yet, have an entry in Wikipedia: I'm quite pleased about that... but should I add him?)