MP vs IMPs

#1 helene_t

The Abbess

Group: Advanced Members
Posts: 17,198
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted 2014-December-12, 07:43

The thread about bypassing spades in 1m-1♥-1NT has become a blend of two different discussions, so I thought maybe it's better to start a separate thread about MPs vs IMPs.

Obviously (cross)IMPs work best (in terms of identifying the best pair) if we assume that big swings (such as making or not making a slam) reflect skill more than small swings (such as overtricks or part score sacs). This will obviously be the case to some extent, if only because a big swing can occur as an aggregate of several good decisions, for example first jamming opps' auction effectively, then doubling them, and then defending well.

On the other hand, MPs work best if some boards allow skill to translate into bigger swings than other boards do.

In addition, I thought that MPs work better in large fields but I am not sure if that is really true.

So I made some simulations of a 27-board Mitchell (27*1 board, 9*3 or 3*9) in which I assumed that the raw score on a board was normally distributed with a mean value of
E(rawscore[board,nspair,ewpair]) = skillfactor[board] * (strength[ns]-strength[ew])
where the skillfactor was gamma distributed across boards with a shape parameter which I allowed to vary between simulations. Rate=Shape to keep the average skill factor constant between sims.

The variance of the raw score was gamma distributed across boards, independent of the skill factor.
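For readers who want to experiment, the raw-score model described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the code used for the simulations: the names `draw_board` and `raw_score`, the noise shape of 2.0 and the example strengths are all hypothetical, since the post does not give them.

```python
import random


def draw_board(shape, rng, noise_shape=2.0):
    """Draw the per-board parameters: (skill_factor, variance).

    noise_shape is an assumed value; the post does not state it.
    """
    # random.gammavariate(alpha, beta) uses beta as the SCALE, so
    # rate = shape corresponds to scale = 1 / shape and a mean of 1.
    skill_factor = rng.gammavariate(shape, 1.0 / shape)
    # Variance of the raw score: gamma distributed, drawn independently
    # of the skill factor.
    variance = rng.gammavariate(noise_shape, 1.0 / noise_shape)
    return skill_factor, variance


def raw_score(board, strength_ns, strength_ew, rng):
    """Normally distributed raw score for one NS pair against one EW pair."""
    skill_factor, variance = board
    mean = skill_factor * (strength_ns - strength_ew)
    return rng.gauss(mean, variance ** 0.5)


rng = random.Random(2014)
board = draw_board(shape=0.1, rng=rng)
# Hypothetical strengths, for illustration only.
print(raw_score(board, strength_ns=0.5, strength_ew=-0.5, rng=rng))
```

A small shape such as 0.1 gives a few boards with a very large skill factor and many with almost none, while a shape of 10 makes all boards nearly alike.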

Before calculating IMPs and MPs I rounded off to the nearest multiple of 50 to allow for ties at matchpoints (rounding was also applied for IMP scoring, for a fairer comparison). I used Butler scoring without outlier removal.
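The scoring step could look something like the sketch below. It assumes the standard IMP scale, matchpoints counted as 1 per pair beaten and 0.5 per tie, and a Butler datum that is the plain mean of all scores on the board rounded to the nearest 10. The rounding of the datum and all function names are assumptions for illustration, not details taken from the post.

```python
from bisect import bisect_right

# Smallest score difference worth 1, 2, ..., 24 IMPs (standard IMP scale).
IMP_THRESHOLDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
                  750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
                  3500, 4000]


def round50(score):
    """Round a raw score to the nearest multiple of 50."""
    return 50 * round(score / 50)


def to_imps(diff):
    """Convert a score difference to IMPs, keeping its sign."""
    magnitude = bisect_right(IMP_THRESHOLDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude


def matchpoints(scores):
    """Matchpoints per table: 1 for each score beaten, 0.5 for each tie."""
    return [sum(1.0 if s > t else 0.5 if s == t else 0.0
                for j, t in enumerate(scores) if j != i)
            for i, s in enumerate(scores)]


def butler(scores):
    """Butler IMPs against the mean score, with no outlier removal."""
    # Rounding the datum to the nearest 10 is an assumption of this sketch.
    datum = 10 * round(sum(scores) / len(scores) / 10)
    return [to_imps(s - datum) for s in scores]


scores = [620, 170, 620, -100]   # NS scores on one board at four tables
print(matchpoints(scores))
print(butler(scores))
```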

The average Spearman correlations between strength and IMP scoring were (as a function of the shape parameter of the skill factor distribution and the number of tables):

shape;tables  .1;3   .1;9   .1;27  1;3    1;9    1;27   10;3   10;9   10;27
IMPs          0.756  0.833  0.864  0.877  0.929  0.952  0.920  0.960  0.979

For MPs:

shape;tables  .1;3   .1;9   .1;27  1;3    1;9    1;27   10;3   10;9   10;27
MPs           0.744  0.844  0.880  0.869  0.931  0.957  0.920  0.962  0.980

So it looks like for large values of the shape parameter (i.e. the skill factor is roughly the same for all boards), it doesn't matter which scoring you use, and this holds regardless of field size. But for more heterogeneous sets of boards (low values of the skill factor shape parameter), MPs are better for large fields and Butler is better for small fields, with a break-even point somewhere between 3 and 9 tables.

Based on 9000 sims, using both the ew and the ns ranking so 18000 data points per parameter combination.
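For anyone wanting to reproduce the comparison, a Spearman correlation is just the Pearson correlation of the ranks. A dependency-free sketch might look like this; the function names are illustrative, and ties get their average rank, which is the usual convention but an assumption about what was done here.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rankings."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Here `x` would be the true strengths of the pairs in one direction and `y` their session totals under the scoring method being tested.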

Maybe I should have a go with correlated noise and skill factor, which is probably realistic. This would favour matchpoints, I would think.

Of course this is all based on a huge number of simplifications and assumptions. It would be cool if someone could do a similar analysis of real data.

The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#2 Vampyr

Group: Advanced Members
Posts: 10,611
Joined: 2009-September-15
Gender:Female
Location:London

Posted 2014-December-12, 08:29

helene_t, on 2014-December-12, 07:43, said:

Obviously (cross)IMPs work best (in terms of identifying the best pair) if we assume that big swings (such as making or not making a slam) reflect skill more than small swings

But losing big swings is often unrelated to skill, as was pointed out in the other thread.

I know not with what weapons World War III will be fought, but World War IV will be fought with sticks and stones -- Albert Einstein

#3 whereagles

Group: Advanced Members
Posts: 14,900
Joined: 2004-May-11
Gender:Male
Location:Portugal
Interests:Everything!

Posted 2014-December-12, 10:31

@Helene: I haven't had the time yet to digest the post, but at first sight I'd say the correlation differences can be attributed to statistical fluctuations.

Or have I gotten it wrong?

#4 helene_t

The Abbess

Group: Advanced Members
Posts: 17,198
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted 2014-December-12, 11:06

No, no, the standard errors are mostly less than 0.001

The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#5 rhm

Group: Advanced Members
Posts: 3,092
Joined: 2005-June-27

Posted 2014-December-12, 11:21

I do not share the premise.
Different scoring methods lead to some extent to simply different games, which give different incentives and require different skills.
That's why there are some players who do better at one form of the game than at the other.
There is little point in arguing which is "better" bridge or a fairer game or measures bridge skills better.

Rainer Herrmann

#6 steve2005

Group: Advanced Members
Posts: 3,162
Joined: 2010-April-22
Gender:Male
Location:Hamilton, Canada
Interests:Bridge duh!

Posted 2014-December-12, 11:53

helene_t, on 2014-December-12, 11:06, said:

No, no, the standard errors are mostly less than 0.001

That's a very small error for what is essentially one big guess. lol

Sarcasm is a state of mind

#7 barmar

Group: Admin
Posts: 21,591
Joined: 2004-August-21
Gender:Male

Posted 2014-December-12, 15:30

rhm, on 2014-December-12, 11:21, said:

Although the "best of the best" seem to be good at all forms of the game. E.g. teams containing Meckwell tend to do well both in KOs and BAM -- they're almost always strong contenders for Spingold, Vanderbilt, Bermuda Bowl, and Reisinger.

#8 Siegmund

Alchemist

Group: Advanced Members
Posts: 1,764
Joined: 2004-June-15
Gender:Male
Location:Beside a little lake in northwestern Montana
Interests:Creator of the 'grbbridge' LaTeX typesetting package.

Posted 2014-December-13, 11:17

You reached some different conclusions than I did, when I investigated some similar questions a while back, but we made very different assumptions, too. Some scattered thoughts:

Quote

I thought that MPs work better in large fields but I am not sure if that is really true.

This is not at all what I would expect. Whatever scoring method you use on a particular board, your result is determined 1/4 by your partnership, 1/4 by your table opponents, and 1/2 by the people against whom you are compared. Comparing against a large field diminishes the noise added by the second half, the "luck of who you are compared against". Whether you play matchpoints on a T top or are cross-imps against T other tables, the variance of your scores is proportional to 1+1/T. This can be confirmed by live results, too -- I was "blessed" with a club with a lot of 2 1/2 table games in the winter, a while back, so had some data to compare T=1,2,3,4 from real life, plus T=12 from regionals.
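The 1+1/T figure is easy to check numerically for a linear comparison such as cross-IMPs. The sketch below assumes every table's result is an independent draw with unit variance and that your score is your own result minus the average of the T other results. Matchpoints are rank-based, so for them this is only an analogy. The function name and the choice of a normal distribution are illustrative assumptions.

```python
import random
import statistics


def comparison_variance(n_other_tables, n_trials, rng):
    """Variance of (own result - mean of the other tables' results)."""
    diffs = []
    for _ in range(n_trials):
        own = rng.gauss(0.0, 1.0)
        others = [rng.gauss(0.0, 1.0) for _ in range(n_other_tables)]
        diffs.append(own - sum(others) / n_other_tables)
    return statistics.pvariance(diffs)


rng = random.Random(2014)
for t in (1, 2, 3, 4, 12):
    # Simulated variance next to the theoretical 1 + 1/T.
    print(t, round(comparison_variance(t, 20000, rng), 3), round(1 + 1 / t, 3))
```

The own-table term contributes variance 1 and the average of T independent comparisons contributes 1/T, which is where the formula comes from.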

It is one reason I am surprised by the enduring popularity of head-to-head team matches, which are cursed with all the same extra randomness caused by only a single comparison that 2 1/2 table pairs games are. Non-statisticians seem to equate knowing the name of the source of the randomness and being able to yell at him after the session, with the result not being random.

* * *

Quote

if only because a big swing can occur as an aggregate of several good decisions, for example first jamming opps' auction effectively, then doubling them, and then defending well

This also reflects a different and imo rather unusual approach to the origin of swings. I've always taken the perspective that if nobody makes any mistakes, the expected score is close to average, and that swings occur only as a result of someone making a mistake -- whether that mistake is guessing the wrong final contract to play because the bidding has been jammed, or failing to double, or failing to defend right, or failing to declare right. Or, to put it another way, there is IMO no such thing as "creating" a swing by playing well -- only taking advantage of the opportunities for positive swings which your opponents create, and minimizing the number of opportunities for adverse swings that you create.

#9 cherdano

5555

Group: Advanced Members
Posts: 9,519
Joined: 2003-September-04
Gender:Male

Posted 2014-December-13, 11:44

helene_t, on 2014-December-12, 07:43, said:

So I made some simulations of a 27-board Mitchell (27*1 board, 9*3 or 3*9) in which I assumed that the raw score on a board was normally distributed with a mean value of
E(rawscore[board,nspair,ewpair]) = skillfactor[board] * (strength[ns]-strength[ew])
where the skillfactor was gamma distributed across boards with a shape parameter which I allowed to vary between simulations. Rate=Shape to keep the average skill factor constant between sims.

The variance of the raw score was gamma distributed across boards, independent of the skill factor.

(Italics are my emphasis.)

I don't understand this assumption. If there is a large swing available by superior bidding or play, then I could probably stumble into that larger swing by pure luck?

I think this assumption negates the main reason that matchpoint scoring is more accurate. If I see a large swing in your simulation, then it is very likely based on skill (since the amount of points available by luck is constant across all boards).
I do not think the same can be said in the game of "contract bridge".

The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke

#10 helene_t

The Abbess

Group: Advanced Members
Posts: 17,198
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted 2014-December-13, 13:16

cherdano, on 2014-December-13, 11:44, said:

If there is a large swing available by superior bidding or play, then I could probably stumble into that larger swing by pure luck?

I think this assumption negates the main reason that matchpoint scoring is more accurate. If I see a large swing in your simulation, then it is very likely based on skill (since the amount of points available by luck is constant across all boards).
I do not think the same can be said in the game of "contract bridge".

Yes I think you are right. As I said, I should maybe have a go with correlated noise and skill factor, i.e. the swing boards have a larger luck component as well as a larger skill component.

Alternatively, if you think swing boards just have a larger luck factor, then we should keep the model as it is but the parameters for the noise distribution could be changed.

The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#11 Vampyr

Group: Advanced Members
Posts: 10,611
Joined: 2009-September-15
Gender:Female
Location:London

Posted 2014-December-13, 14:42

Siegmund, on 2014-December-13, 11:17, said:

It is one reason I am surprised by the enduring popularity of head-to-head team matches, which are cursed with all the same extra randomness caused by only a single comparison that 2 1/2 table pairs games are.

This is very very different, because you are comparing with your teammates' table.

I know not with what weapons World War III will be fought, but World War IV will be fought with sticks and stones -- Albert Einstein

#12 nige1

5-level belongs to me

Group: Advanced Members
Posts: 9,128
Joined: 2004-August-30
Gender:Male
Location:Glasgow Scotland
Interests:Poems Computers

Posted 2014-December-13, 15:40

Vampyr, on 2014-December-12, 08:29, said:

But losing big swings is often unrelated to skill, as was pointed out in the other thread.

I agree. Assuming that big swings are more related to skill than small swings seems to beg the question.

Guthrie.tech

#13 jogs

Group: Advanced Members
Posts: 1,316
Joined: 2011-March-01
Gender:Male
Interests:student of the game

Posted 2014-December-13, 18:16

steve2005, on 2014-December-12, 11:53, said:

That's a very small error for what is essentially one big guess. lol

The first difference between double dummy 1NT and observed results from actual play is huge.
Double dummy 1NT declarer averages less than 6.2 tricks.
Observed results declarer averages more than 6.8 tricks.
That is a first difference of over 0.6 tricks.

#14 cherdano

5555

Group: Advanced Members
Posts: 9,519
Joined: 2003-September-04
Gender:Male

Posted 2014-December-13, 18:38

helene_t, on 2014-December-13, 13:16, said:

I would do something different: take the results table from a big MP tourney. For each table, determine the percentile obtained at this table by a combination of luck and skill difference between the two pairs. I am not sure this is theoretically sound, but how else do you want to mimic a board where +650, +620, +200 and +170 are common results, and where both the difference between
- game bonus or not, and
- 11 tricks or 10 tricks
may potentially be attributed to either mostly skill, or mostly luck.

(I.e., my point is that even though the standard deviation on this board may be much much larger, a 30 point difference may well point to a skill difference here. That's the point of matchpoints.)

The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke

#15 barmar

Group: Admin
Posts: 21,591
Joined: 2004-August-21
Gender:Male

Posted 2014-December-13, 19:10

Siegmund, on 2014-December-13, 11:17, said:

Or, to put it another way, there is IMO no such thing as "creating" a swing by playing well -- only taking advantage of the opportunities for positive swings which your opponents create, and minimizing the number of opportunities for adverse swings that you create.

While you may not be able to create swings by playing "well", you can make things harder for the opponents, which gives them more opportunities to go wrong, which then creates swings. This is essentially why teams that are far behind in a match will bid more aggressively, as well as psyching heavily. They're challenging the opponents (who would otherwise play conservatively) to figure out what's going on.

#16 mgoetze

Group: Advanced Members
Posts: 4,942
Joined: 2005-January-28
Gender:Male
Location:Cologne, Germany
Interests:Sleeping, Eating

Posted 2014-December-13, 23:26

Siegmund, on 2014-December-13, 11:17, said:

Whatever scoring method you use on a particular board, your result is determined 1/4 by your partnership, 1/4 by your table opponents, and 1/2 by the people against whom you are compared.

How did you make up these numbers and what are they supposed to mean? If I play a board where it is obvious for my side to pass throughout, and the opps have an obvious claim for exactly 12 tricks at trick one, then my result is obviously determined "0%" by my partnership. Furthermore, at a given average skill level, I would expect the distribution of comparison scores to stabilize as the field gets larger, eventually making my table opponents on that particular hand the only relevant factor for determining my score.

"One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision"
-- Bertrand Russell

#17 Siegmund

Alchemist

Group: Advanced Members
Posts: 1,764
Joined: 2004-June-15
Gender:Male
Location:Beside a little lake in northwestern Montana
Interests:Creator of the 'grbbridge' LaTeX typesetting package.

Posted 2014-December-14, 00:43

Quote

I thought it was self-evident, but I will use your example to try to make it clearer.

If you play a board where the opponents have an obvious 6H+6, and neither side does anything stupid, you expect to get an average board.

1) You or your partner have the power to guarantee yourself a bad board - by bidding 7C, by underleading an ace at trick 1, or whatever else.
2) Your opponents have the power to give you a good board - by failing to bid slam, or by bidding 7H, or fumbling the claim, or whatever else.
3) Each other pair in the room holding your cards, has the power to give himself a bad board, and give you one extra matchpoint, by misdefending. Collectively, all-the-other-people-holding-your-cards make 1/4 of the decisions that affect what your score on the board will be.
4) Each other pair in the room holding the slam cards has the power to give himself a bad board, and cost you one extra matchpoint, by misdeclaring. Collectively, all-the-other-people-holding-your-opponents-cards make 1/4 of the decisions that affect what your score on the board will be.

It so happens, on your example board, that you had an easy decision, and it wasn't particularly hard for you to do your part.

In my view, on EVERY board, there is SOME result that would happen if everybody at the table did everything right. When the angels play against each other in Heaven, every board has been a 50% board since Satan was cast out.

In the real world, you can cost yourself your entitlement to an average board by a misjudgment, and so can your opponents, and so can the other people in the room. On the complicated boards where the bidding can go ten different ways, you very often make a mistake that your opponents could capitalize on (if they knew how.) Your table opponents make mistakes that give you chances back. The final outcome is a complicated mess -- determined by which side succeeded in throwing away more.

Over the course of the evening, we expect you to face approximately the same number of "interesting" decisions as your table opponents do, as the NS at other tables do, and as the EW at other tables do.

The way I choose to approach this statistics problem, we have the same question on every board -- which of the 4 groups of people makes mistakes, with what frequency and severity? The degree of difficulty faced by each side can be different on each board, as you observe. If you played an entire session of bridge able to see through the backs of your opponents' cards, but they couldn't see through yours, you would expect to get a score near 75%. You couldn't guarantee yourself more than that, because you can't force your opponents to make a mistake on every board, nor can you prevent the people at other tables from doing weird stuff so that your good boards won't always be tops.

Adapting Helene's model to my philosophy, for each board we would need to draw two difficulty scores, D_NS and D_EW, from some distribution. And for each pair, let's have some quality measurement Q_i that says how often, on a relative scale, that pair makes errors. To get a "table result" on a board, we do something like let M_i, the number of mistakes made by pair i on this board, be Poisson(D_NS * Q_i) or Poisson(D_EW * Q_i), and take M_1-M_2 as the "result" from one table, M_3-M_4 the "result" from the next, and matchpoint them. Or a more complicated model that allows errors of different sizes. The results won't depend much, qualitatively, on exactly how complicated of a model you use.
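As a rough illustration of the mistake-count model just described, here is a minimal sketch. The Poisson sampler uses Knuth's multiplication method because Python's standard library has none. Scoring the table from the NS side as EW mistakes minus NS mistakes is a sign convention chosen for this sketch, and all names and numbers are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate (Knuth's method; fine for small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def table_result(d_ns, d_ew, q_ns, q_ew, rng):
    """Result at one table from the NS point of view.

    d_ns, d_ew: difficulty of the board for each direction.
    q_ns, q_ew: error-proneness of the two pairs (higher means worse).
    """
    mistakes_ns = poisson(d_ns * q_ns, rng)
    mistakes_ew = poisson(d_ew * q_ew, rng)
    # Sign convention assumed here: more EW mistakes is good for NS.
    return mistakes_ew - mistakes_ns


rng = random.Random(17)
# One board, harder for EW (d_ew = 2.0) than for NS (d_ns = 0.5),
# played at three tables by pairs of differing quality.
pairs = [(1.0, 1.0), (0.5, 1.5), (1.5, 0.5)]   # (q_ns, q_ew) per table
print([table_result(0.5, 2.0, q_ns, q_ew, rng) for q_ns, q_ew in pairs])
```

The list of table results on a board would then be matchpointed in the usual way.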

Quote

Furthermore, at a given average skill level, I would expect the distribution of comparison scores to stabilize as the field gets larger, eventually making my table opponents on that particular hand the only relevant factor for determining my score.

No argument at all that the distribution of scores from the rest of the field will stabilize as the number of comparisons on the board increases. (It stabilizes to a different distribution according to how strong the rest of the field is.)

#18 mgoetze

Group: Advanced Members
Posts: 4,942
Joined: 2005-January-28
Gender:Male
Location:Cologne, Germany
Interests:Sleeping, Eating

Posted 2014-December-14, 01:24

Siegmund, on 2014-December-14, 00:43, said:

I thought it was self-evident, but I will use your example to try to make it clearer.

I still don't even know what the numbers mean. If my partnership plays horribly, the opponents do nothing special, and the field does nothing special, we are not getting 25%*0%+25%*50%+50%*50% = 37.5% on the board. We are getting 0%.

Quote

If you play a board where the opponents have an obvious 6H+6, and neither side does anything stupid, you expect to get an average board.

I didn't say it was obvious, I'm assuming it's difficult to bid and/or easy to overbid.

Quote

1) You or your partner have the power to guarantee yourself a bad board - by bidding 7C, by underleading an ace at trick 1, or whatever else.

Perhaps. Perhaps not, there are enough slam boards where any sort of sacrifice is obviously absurd, and I could underlead my ace into their KQJ opposite xxx.

Quote

Over the course of the evening, we expect you to face approximately the same number of "interesting" decisions as your table opponents do, as the NS at other tables do, and as the EW at other tables do.

Aha, so this is your assumption. I find it pretty much never holds true in any sort of 1 day event. More crucially, if you want to discuss the difference between MPs and IMPs, it is vital that at IMPs the "interesting" decisions do not all have the same weight in determining your score.

Quote

And yet you claim that the result is "determined" by a constant "1/2" by the field. Again, what does this mean?

"One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision"
-- Bertrand Russell

#19 jogs

Group: Advanced Members
Posts: 1,316
Joined: 2011-March-01
Gender:Male
Interests:student of the game

Posted 2014-December-15, 11:23

Random luck plays a huge role in results. The contribution of skill to the total margin grows linearly with the number of boards played. The contribution of luck grows only with the square root of the number of boards played.
You are on a good AX team playing world champions. You should be able to win 35-40% of 7 board matches. The longer the match, in terms of total boards, the less likely it is that your team can upset the champions.
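To put rough numbers on that, suppose the favourites' margin on each board is an independent draw with mean m and standard deviation s. Over n boards the total margin then has mean n*m and standard deviation s*sqrt(n), so under a normal approximation (ignoring ties) the chance of an upset is Phi(-(m/s)*sqrt(n)). The sketch below calibrates m/s from an assumed 37.5% upset chance over 7 boards, the midpoint of the range quoted above. The function names are made up for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def upset_probability(n_boards, edge_per_board, sd_per_board):
    """P(underdog wins) when the favourite's margin is a sum of iid boards."""
    z = edge_per_board * n_boards / (sd_per_board * n_boards ** 0.5)
    return STD_NORMAL.cdf(-z)


def edge_ratio_from_upset(n_boards, p_upset):
    """Edge-to-sd ratio per board implied by a known upset probability."""
    return -STD_NORMAL.inv_cdf(p_upset) / n_boards ** 0.5


# Assumed calibration: a 37.5% upset chance in a 7-board match.
ratio = edge_ratio_from_upset(7, 0.375)
for n in (7, 14, 28, 64, 128):
    print(n, round(upset_probability(n, ratio, 1.0), 3))
```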

#20 nige1

5-level belongs to me

Group: Advanced Members
Posts: 9,128
Joined: 2004-August-30
Gender:Male
Location:Glasgow Scotland
Interests:Poems Computers

Posted 2014-December-15, 13:30

rhm, on 2014-December-12, 11:21, said:

I do not share the premise. Different scoring methods lead to some extent to simply different games, which give different incentives and require different skills. That's why there are some players who do better at one form of the game than at the other. There is little point in arguing which is "better" bridge or a fairer game or measures bridge skills better.

MPs and imps require slightly different skills but the skills seem to correlate and overlap. You would expect players who perform well at one form of the game to perform well at the other.

Guthrie.tech
