Monthly Archives: April 2012

Man U’s goals against

The extraordinary scoreline at Old Trafford led to a piece in the Guardian’s Five Things about how important Vidic is to the team’s defensive performance.

Several of their most egregious defensive failings have come against teams in European competition, but here is how this year’s league performances compare with other EPL seasons.

For the fifth season running, the most likely outcome will be a clean sheet. Other than the six-goal aberration against City, the most interesting tid-bit from this year is that they have conceded precisely two goals on only one occasion. Even then, they could have been more profligate without much impact, as it was in the game in which they scored eight against Arsenal.

Losing their Spurs

In mid-February, Spurs appeared a lock to have their best season in... well, quite a while. Coming off a 5-0 drubbing of the impressive Newcastle team, they were firmly ensconced in third place, with fans harbouring slight championship hopes rather than worrying about 4th-placed Arsenal, 10 points in arrears. All looked set for a first season since 1994/5 finishing ahead of their North London rivals, with a spot in next season’s Champions League almost a formality.

Not so fast.

They currently trail the Gunners by five points and have Newcastle and Chelsea snapping at their heels. Over the past eight games they have garnered just one win and six points – only seemingly-doomed Wolves have fared worse over this time period.

How does their worst run over eight games this season compare with other teams?

Wigan have produced yet another season with an eight-game losing streak. On neither of the previous two occasions have they been relegated, but third time unlucky is certainly on the cards. Liverpool’s poor recent run – ending with their last-minute win against Blackburn – has already resulted in the General Manager’s dismissal.

The last time Spurs suffered a worse run was at the beginning of the 2008-9 season when they only sported two points from the first eight games, three points adrift at the foot of the table and already 18 points behind joint leaders, Chelsea and Liverpool.

Unsurprisingly, the manager, Juande Ramos, was sacked and the Harry Redknapp era began.

Here is Spurs’ worst eight-match run for each EPL season

The 2003/4 two-pointer was of little significance for the team, as Spurs were a mid-table team when the run started in March and were never in danger of relegation: the 37 points they had already garnered were more than any relegated team managed.

They were also a mid-table team when they went into their New Year’s Day 1994 home game with Coventry. A loss then was followed by six more, and another three draws made it a double-digit winless streak. They struggled through the remainder of the season, finishing just three points clear of the relegation trapdoor. Ossie Ardiles, the manager, was enough of an icon to keep his job into the following season, but some big-name signings, including Jurgen Klinsmann, failed to lift them above mid-table mediocrity and he was let go.

Of the 45 teams who have ever graced the Premiership, eight have had at least one sequence of eight defeats during a season. Here are the results for teams that have spent at least half of the EPL era in the Premiership.

Only Swansea, in a yet-to-be-completed first season, can match Man United and with a current four game losing run that too is in danger. Other than Sheff Utd’s three season stint, all other clubs have had at least one eight game stretch in which they have mustered just a single win or three draws.

Quick off the mark

With none of the top teams particularly impressing this season, Alan Pardew’s performance with Newcastle – especially in the transfer market – is likely to see him receive considerable recognition in the Manager of the Year award.

Recent acquisition Papiss Cissé has proved particularly fruitful, with a brace against Swansea last time out taking him to nine goals in just eight outings. At half-time, a question posed in the Guardian was whether or not he was the fastest in EPL careers to reach the eight strikes mark. Sounded like a good excuse to haul Rs. If you are only interested in results, check out the tables below.

The data I need is on an MSSQL database, so the first thing to do is load the RODBC package and execute a query.
The query extracts every match played by every player, together with the goals scored (the WHERE clause ensures that bench-only appearances are not included).

library(RODBC)
channel <- odbcConnect("eplR")
 
goalGames <- sqlQuery(channel,paste(
"
SELECT     soccer.tblTeam_Names.TEAMNAME as team, soccer.tblPlayers.PLAYERID, 
CASE when soccer.tblPlayers.FIRSTNAME is null then soccer.tblPlayers.LASTNAME else soccer.tblPlayers.FIRSTNAME + ' ' + soccer.tblPlayers.LASTNAME end as name, soccer.tblMatch.DATE
as gameDate, soccer.tblPlayer_Match.GOALS as goals
FROM         soccer.tblPlayer_Match INNER JOIN
                      soccer.tblPlayers INNER JOIN
                      soccer.tblPlayerClub ON soccer.tblPlayers.PLAYERID = soccer.tblPlayerClub.PLAYERID ON soccer.tblPlayer_Match.PLAYER_TEAM = soccer.tblPlayerClub.PLAYER_TEAM INNER JOIN
                      soccer.tblMatchTeam ON soccer.tblPlayer_Match.TEAMMATCHID = soccer.tblMatchTeam.TEAMMATCHID INNER JOIN
                      soccer.tblMatch ON soccer.tblMatchTeam.MATCHID = soccer.tblMatch.MATCHID INNER JOIN
                      soccer.tblTeam_Names ON soccer.tblMatchTeam.TEAMID = soccer.tblTeam_Names.TEAMID
where (soccer.tblPlayer_Match.START + soccer.tblPlayer_Match.[ON]) > 0
ORDER BY soccer.tblMatch.DATE
"
));
odbcClose(channel)

I want to add columns to show both the game order – there are no double-headers in the EPL so the date identifies the game – and a cumulative sum of the goals scored. To assist in this I utilize the popular plyr package.

library(plyr)
 
# PLAYERID is unique, name is user-friendly
# seq_along() numbers each player's games without relying on plyr's
# internal 'piece' variable
goalGames <- ddply(goalGames,c("PLAYERID","name"), transform,
games = seq_along(goals),cumGoals = cumsum(goals))
 
# this is what the dataframe now looks like
head(goalGames,1)
#        team PLAYERID            name   gameDate goals games cumGoals
# 1 Ipswich T  ABIDALN Nabil Abidallah 2001-02-24     0     1        0
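
For readers without plyr to hand, the same two columns can be sketched in base R with `ave()`. This is only an illustration: the `toy` data frame and its values are made up, and the rows are assumed to be already ordered by date within each player (as the ORDER BY in the query ensures).

```r
# Toy data: hypothetical players and scores, already in date order
toy <- data.frame(
  PLAYERID = c("A", "A", "A", "B", "B"),
  gameDate = as.Date(c("2012-01-01", "2012-01-08", "2012-01-15",
                       "2012-01-01", "2012-01-08")),
  goals    = c(0, 2, 1, 1, 0)
)

# ave() applies FUN within each PLAYERID group and returns a vector
# aligned with the original rows
toy$games    <- ave(toy$goals, toy$PLAYERID, FUN = seq_along)
toy$cumGoals <- ave(toy$goals, toy$PLAYERID, FUN = cumsum)
toy
```

Player A ends up with games 1 to 3 and a running total of 0, 2, 3; player B restarts at game 1.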

The original question posed was whether or not Cissé was the quickest to reach the eight-goal mark. The following code shows that his impressive goals-per-game rate still did not hack it.

# Note the inequality as a player might score more than one goal in a game
goalCount <- 8
minGames <- min(subset(goalGames,cumGoals>=goalCount)$games) #  Ans 5
 
# Now find the rows that fit the 8 goals in 5 games criteria
# and reduce the answer to relevant columns only
fastest <-subset(goalGames,cumGoals>=goalCount&games==minGames)[,c(2,3,6,7)]
 
print(fastest) # Uh-Oh we have a tie
#        PLAYERID          name games cumGoals
# 1215    AGUEROS Sergio Aguero     5        8
# 149644   QUINNM    Mick Quinn     5        8
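
The same lookup can be wrapped in a small function so that any goal mark can be queried. What follows is a base R sketch under stated assumptions, not the code used for the tables: `fastest_to` is a hypothetical helper name, and `toy` is invented data with the same `PLAYERID`, `games` and `cumGoals` columns as `goalGames`.

```r
# Hypothetical helper: which players reached goalCount in the fewest games?
fastest_to <- function(df, goalCount) {
  reached <- df[df$cumGoals >= goalCount, ]
  if (nrow(reached) == 0) return(reached)   # nobody has got there yet
  # first game in which each player reached the mark
  first <- aggregate(games ~ PLAYERID, data = reached, FUN = min)
  first[first$games == min(first$games), ]
}

# Invented running totals for three players
toy <- data.frame(
  PLAYERID = c("A", "A", "A", "B", "B", "C", "C"),
  games    = c(1, 2, 3, 1, 2, 1, 2),
  cumGoals = c(1, 2, 4, 0, 3, 2, 3)
)

fastest_to(toy, 3)   # B and C both get there in game 2, so a tie-break is needed
fastest_to(toy, 2)   # C alone, in game 1
```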

In order to determine exactly who takes the biscuit, we need to obtain data on the actual time of goals scored, apply a ddply to this result and then filter the dataframe for the players, games and goals under consideration.

 
channel <- odbcConnect("eplR")
goalTimes <- sqlQuery(channel,paste(
"
SELECT     soccer.tblTeam_Names.TEAMNAME as team, soccer.tblPlayers.PLAYERID,
 CASE when soccer.tblPlayers.FIRSTNAME is null then soccer.tblPlayers.LASTNAME else soccer.tblPlayers.FIRSTNAME + ' ' + soccer.tblPlayers.LASTNAME end as name, soccer.tblMatch.DATE as gameDate, soccer.tblGoals.time
 
FROM         soccer.tblPlayer_Match INNER JOIN
                      soccer.tblPlayers INNER JOIN
                      soccer.tblPlayerClub ON soccer.tblPlayers.PLAYERID = soccer.tblPlayerClub.PLAYERID ON soccer.tblPlayer_Match.PLAYER_TEAM = soccer.tblPlayerClub.PLAYER_TEAM INNER JOIN
                      soccer.tblMatchTeam ON soccer.tblPlayer_Match.TEAMMATCHID = soccer.tblMatchTeam.TEAMMATCHID INNER JOIN
                      soccer.tblMatch ON soccer.tblMatchTeam.MATCHID = soccer.tblMatch.MATCHID INNER JOIN
                      soccer.tblTeam_Names ON soccer.tblMatchTeam.TEAMID = soccer.tblTeam_Names.TEAMID  INNER JOIN
                      soccer.tblGoals ON soccer.tblPlayer_Match.PLAYER_MATCH = soccer.tblGoals.PLAYER_MATCH
WHERE     (soccer.tblPlayer_Match.START + soccer.tblPlayer_Match.[ON] > 0)  
ORDER BY soccer.tblMatch.DATE,soccer.tblGoals.time 
 
"
));
 
odbcClose(channel)
 
# obtain ID's of tied game players
ties <- fastest$PLAYERID
 
# add cumulative goals to the goalTimes dataframe
# one row per goal, so the row number within each player is his running goal count
goalTimes <- ddply(goalTimes,c("PLAYERID","name"), transform,sumGoals = seq_along(time) )
final <- arrange(subset(goalTimes,PLAYERID %in% ties&sumGoals==goalCount),time)
 
# select the team, name, time and sumGoals columns
print(final[,c(1,3,5,6)]) # and ladies and gentlemen the winner is...
#         team          name time sumGoals
# 1  Man. City Sergio Aguero   46        8
# 2 Coventry C    Mick Quinn   73        8

That’s a lot of work for one result. It is a pretty simple matter to extend the result to show data for the fastest player to one, two, three goals etc. The full code is given below.

library(RODBC)
library(plyr)  # needed for ddply() and arrange() below
 
channel <- odbcConnect("eplR")
 
goalGames <- sqlQuery(channel,paste(
"
SELECT     soccer.tblTeam_Names.TEAMNAME as team, soccer.tblPlayers.PLAYERID, 
CASE when soccer.tblPlayers.FIRSTNAME is null then soccer.tblPlayers.LASTNAME else soccer.tblPlayers.FIRSTNAME + ' ' + soccer.tblPlayers.LASTNAME end as name, soccer.tblMatch.DATE
as gameDate, soccer.tblPlayer_Match.GOALS as goals
FROM         soccer.tblPlayer_Match INNER JOIN
                      soccer.tblPlayers INNER JOIN
                      soccer.tblPlayerClub ON soccer.tblPlayers.PLAYERID = soccer.tblPlayerClub.PLAYERID ON soccer.tblPlayer_Match.PLAYER_TEAM = soccer.tblPlayerClub.PLAYER_TEAM INNER JOIN
                      soccer.tblMatchTeam ON soccer.tblPlayer_Match.TEAMMATCHID = soccer.tblMatchTeam.TEAMMATCHID INNER JOIN
                      soccer.tblMatch ON soccer.tblMatchTeam.MATCHID = soccer.tblMatch.MATCHID INNER JOIN
                      soccer.tblTeam_Names ON soccer.tblMatchTeam.TEAMID = soccer.tblTeam_Names.TEAMID
where (soccer.tblPlayer_Match.START + soccer.tblPlayer_Match.[ON]) > 0
ORDER BY soccer.tblMatch.DATE
"
));
goalTimes <- sqlQuery(channel,paste(
"
SELECT     soccer.tblTeam_Names.TEAMNAME as team, soccer.tblPlayers.PLAYERID,
 CASE when soccer.tblPlayers.FIRSTNAME is null then soccer.tblPlayers.LASTNAME else soccer.tblPlayers.FIRSTNAME + ' ' + soccer.tblPlayers.LASTNAME end as name, soccer.tblMatch.DATE as gameDate, soccer.tblGoals.time
 
FROM         soccer.tblPlayer_Match INNER JOIN
                      soccer.tblPlayers INNER JOIN
                      soccer.tblPlayerClub ON soccer.tblPlayers.PLAYERID = soccer.tblPlayerClub.PLAYERID ON soccer.tblPlayer_Match.PLAYER_TEAM = soccer.tblPlayerClub.PLAYER_TEAM INNER JOIN
                      soccer.tblMatchTeam ON soccer.tblPlayer_Match.TEAMMATCHID = soccer.tblMatchTeam.TEAMMATCHID INNER JOIN
                      soccer.tblMatch ON soccer.tblMatchTeam.MATCHID = soccer.tblMatch.MATCHID INNER JOIN
                      soccer.tblTeam_Names ON soccer.tblMatchTeam.TEAMID = soccer.tblTeam_Names.TEAMID  INNER JOIN
                      soccer.tblGoals ON soccer.tblPlayer_Match.PLAYER_MATCH = soccer.tblGoals.PLAYER_MATCH
WHERE     (soccer.tblPlayer_Match.START + soccer.tblPlayer_Match.[ON] > 0)  
ORDER BY soccer.tblMatch.DATE,soccer.tblGoals.time 
 
"
));
 
odbcClose(channel)
 
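# seq_along() gives the within-player row number without relying on
# plyr's internal 'piece' variable
goalGames <- ddply(goalGames,c("PLAYERID","name"), transform,
games = seq_along(goals), cumGoals = cumsum(goals))
 
goalTimes <- ddply(goalTimes,c("PLAYERID","name"), transform,
sumGoals = seq_along(time) )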
 
# create df to hold results. count is number of players reaching each goal mark
# (column names match the 'answer' rows built in the loop)
myTable <- data.frame(player=character(),goals=integer(),
game=integer(),minute=integer(),count=integer())
 
for(goalCount in 1:200) {
#goalCount <- 8
minGames <- min(subset(goalGames,cumGoals>=goalCount)$games)
playerCount <-  subset(goalGames,cumGoals>=goalCount)
count <-length(unique(playerCount$name))
 
fastest <-subset(goalGames,cumGoals>=goalCount&games==minGames)[,c(2,3,6,7)]
 
if (nrow(fastest) > 1)
{
 
ties <- fastest$PLAYERID
final <- head(arrange(subset(goalTimes,PLAYERID %in% ties&sumGoals==goalCount),time),1)
answer <- data.frame(player=final$name,goals=goalCount,game=minGames,minute=final$time,count=count)
} else {
# no tie-break needed, so leave the minute as NA to keep the column numeric
answer <- data.frame(player=fastest$name,goals=goalCount,game=minGames,minute=NA,count=count)
}
 
# add number of players who have achieved it
myTable <- rbind(myTable,answer)
 
}
print(myTable)

All time Leaders

An abbreviated result is shown below

There are several points of interest

  • Nasri was only 21 when he converted a penalty in his debut. Brian Deane scored the very first goal in the EPL five minutes in
  • Ravanelli is the only player to score a hat-trick on his debut
  • Pogrebnyak was fastest to five goals earlier this season
  • Cole and Shearer vied for the lead at around the 50 goal mark. The last occasion where a time tie break is required was
    when both of them scored their 43rd goal in 52 games. Shearer left it late but his 82nd minute strike topped Cole by four minutes
  • van Nistelrooy grabbed his 56th goal in his 74th appearance but failed to score in his subsequent four outings. Shearer nabbed his 55th in game 74, powered in eight more in the next five and holds the record for all subsequent goals reached

Individual Club

Further analysis can be done by club by simply extending the ddply to include team
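
As a rough illustration of what adding team to the grouping does, here is a base R sketch using `ave()` in place of ddply. The `toy` data are invented: player A moves from club X to club Y, and his game count and running total restart at the new club.

```r
# Invented data: player A plays for X then Y, player B only for X
toy <- data.frame(
  team     = c("X", "X", "Y", "Y", "X"),
  PLAYERID = c("A", "A", "A", "A", "B"),
  goals    = c(1, 0, 2, 1, 1)
)

# grouping on team AND player restarts the counters at each club
toy$games    <- ave(toy$goals, toy$team, toy$PLAYERID, FUN = seq_along)
toy$cumGoals <- ave(toy$goals, toy$team, toy$PLAYERID, FUN = cumsum)
toy
```

With plyr the equivalent change would be to pass the team column as an extra grouping variable alongside `PLAYERID` and `name`.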

Here are Man U’s figures

  • 10 different Man U players scored on their debut
  • van Nistelrooy played for the team in his prime. Players like Ronaldo and Rooney started much younger
  • United’s two other 100 goal scorers, Scholes and Giggs, converted at half the rate Rooney has

Individual Season

To look at this season’s data I need to bin the game dates I have into the appropriate EPL season. For this I use the cut function and then apply a new ddply function.

years <- 1992:2012
goalGames$season <- cut(goalGames$gameDate,  breaks=as.POSIXct(paste(years,"-08-01",sep="")),
  labels=paste(years[-length(years)],years[-length(years)]+1,sep="/"))
goalTimes$season <- cut(goalTimes$gameDate,  breaks=as.POSIXct(paste(years,"-08-01",sep="")),
  labels=paste(years[-length(years)],years[-length(years)]+1,sep="/"))
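
To see what the binning produces, here is a small worked sketch. It assumes Date values rather than the POSIXct values returned by the database, and the three match dates are made up for illustration.

```r
years  <- 1992:2012
breaks <- as.Date(paste0(years, "-08-01"))
# one label per interval, e.g. "2011/2012" for 1 Aug 2011 up to 31 Jul 2012
labels <- paste(head(years, -1), head(years, -1) + 1, sep = "/")

# hypothetical match dates
gameDate <- as.Date(c("1992-08-15", "2011-08-13", "2012-04-01"))

# for dates, cut() defaults to right = FALSE, so each interval includes
# its opening 1 August and excludes the next one
season <- cut(gameDate, breaks = breaks, labels = labels)
as.character(season)
```

The last two dates fall either side of New Year but land in the same 2011/2012 season, which is the point of breaking on 1 August rather than on the calendar year.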

  • Suarez actually tied Djibril Cisse on the, surprisingly late, 12 minute mark of the season
  • Dzeko’s early season success was not maintained, typifying Man City’s campaign
  • Rooney has played in fewer games so his scoring has lagged van Persie. However, they both reached the 20 goal mark in their 24th game

Some Blue Jay starter history

The new baseball season is nigh and hopes are sky-high

Well, at least for the Toronto Blue Jays there is some optimism, although seasons 2013 onwards are much more likely to see them in the playoffs. This year, the offence – with full campaigns from Lawrie, Johnson and Rasmus – should be improved and the overhauled bullpen will almost certainly be better.

But the rotation has the feeling of “Romero and Morrow. Oh sheesh it’s tomorrow”. McGowan (yet another injury) and Cecil (yet another decline in velocity) are already lost from the starting five and the young guns are not yet ready to compete at the ML level

Let’s look at some history

Seasons are 162 games except 1981 (106), 1994 (115) and 1995 (144); * indicates a lefty

The first table shows the number of starters used, including who led the team and who brought up the rear for each season. Also shown is the number of games started by the five most used starters. The last stat peaked in 1984 when the Jays ran a four-man rotation, with Clancy, Stieb, Alexander and Leal all starting at least 35 games – something no Blue Jay has done in the past eight seasons.

Halladay has headed the rotation in terms of starts the most times (5), but during some of their strongest seasons, 1987-1993, there were seven different game leaders. Indeed, until Guzman, in strike-shortened 1994, no player had repeated as the head of this category.

One relationship where there could be a correlation is between the number of starts by the leading five players and win percentage. I have excluded the three strike-affected seasons.

The graph shows some correlation, but without knowing the history in detail it is difficult to determine if the enforced use of more starters led to a lower win percentage or if poor performance led to more starters being used as the season progressed. However, if you can coax more than 135 starts out of five guys then you should at least reach the 81-81 mark

Unlike the slight, if any, increase in number of starters used over time, the final table does show a significant reduction in the number of complete games

In this category, Dave Stieb is the clear leader with 103 completions – accounting for more than a quarter of his starts. His record of 19 in 1982 is as many as the whole rotation has put up in a single year since 1985. Halladay is third on the list and his 17% conversion would likely have been at Stieb’s level if he had pitched in that era.

Simulated War

I am quite interested in both Wars with sabres and Sabremetric WARs, but the War I am most involved in is the card game. Unfortunately, it is one of my six-year-old’s favourites and he is quite happy to while away the hours (literally) playing it with anyone pressganged into joining him. I must admit that – conscientious objector that I am – whenever a war is underway and my son’s attention is elsewhere, I try to slip a low card to the top of my pile, which has the double attraction of making it more likely that he wins and speeding up the conclusion of hostilities.

However, I thought it might be interesting to see how long games would take without such intervention by coding the process and running a simulation. I often use R for analysis towards blog posts and in this instance thought it might be worthwhile showing my code. No doubt there are several improvements that could be suggested. The wiki article does post simulation results, but the game I play has a couple of variations, making comparisons more interesting.

The first thing to do is to create a deck (I have used numbers 11 through 14 for JQKA).

# create a regular deck.
# All suits are equivalent so there will be four of each number
deck <- rep(2:14,4)

Then a random sample is provided to each player (p1, p2). The first 26 are pretty simple. However, the setdiff function in base R does not seem to handle duplicates, so I utilized the package ‘sets’. Even that was a little tricky but, as usual, the answer to a stackoverflow query set me on the right track.

library(sets)
 
# make results reproducible
# set.seed(1066) 
p1 <- sample(deck, 26, replace=FALSE)
 
# multiset difference: the cards not dealt to p1, keeping duplicates
diffs <- gset_difference(as.gset(deck), as.gset(p1))
# create vector
p2 <- rep(unlist(diffs), times=gset_memberships(diffs))
# this produces the right cards but in order so randomize
p2 <- sample(p2, 26, replace=FALSE)
p1
# [1]  3  3  6 14 10 13  3 10 12  5 11  8  2  5  8  4 10  5  8  4
# [21]  8  7  7  7  9 14
p2
# [1] 12  6  9  5  9  9 13  4  2 14  4 13 13  6  6 11  7 11 12  2
# [21] 10  3  2 11 12 14
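
An alternative that avoids the multiset difference altogether is to shuffle the whole deck once and split it in two. This is a base R sketch of that idea, not the approach used in the post; the seed is arbitrary.

```r
set.seed(1066)            # arbitrary seed, only to make the deal repeatable
deck <- rep(2:14, 4)

# sample() with no size argument returns a random permutation of the deck
shuffled <- sample(deck)
p1 <- shuffled[1:26]
p2 <- shuffled[27:52]

# sanity check: between them the two hands hold exactly the original deck
identical(sort(c(p1, p2)), sort(deck))
```

Because each card goes to exactly one player, duplicates are handled for free and p2 needs no second shuffle.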

In the analysis, I may be interested in the starting conditions of each player’s hand, so I need to compute the overall value, the number of aces and – as 2’s trump aces in my variation – deuces. Only one player’s data is required, but I will need to add the result after each game. For now I put in an “N” value.

p1Cards <- length(p1)
strength <- sum(p1) # 196; the two hands always total 416, so p2 has the stronger hand
aces <- sum(p1==14) # 2
deuces <- sum(p1==2) # 1
result <- "N"
 
 
 
game <- data.frame(id=i, strength=strength,aces=aces,deuces=deuces,result=result, stringsAsFactors=FALSE)
 
draw <- c(p1[1],p2[1])
 
booty <- c()

Now the game can start as the top card in each player’s deck is drawn. There are three outcomes: either one of the players has the higher card and wins the battle, or it is a tie and a war ensues. Here p1 draws a 3 and p2 a Queen (12), so p2 takes the drawn cards and adds them to the bottom of his deck. Here is the relevant code.

if (p2[1]>p1[1]) {
p2 <- c(p2[-1],draw)  
p1 <- p1[-1]
}

Of course the code needs to take account of the occasion when p1’s card is higher. There is also the wrinkle mentioned above where an Ace is trumped by a 2. The full code for a battle is below.

if (p1[1]>p2[1]) {      
if (p1[1]==14&p2[1]==2){ #  ace(14) vs a 2
p2 <- c(p2[-1],draw) 
p1 <- p1[-1]
} else {
p1 <- c(p1[-1],draw)
p2 <- p2[-1] 
}
} else if (p2[1]>p1[1]) { 
if (p2[1]==14&p1[1]==2){
p1 <- c(p1[-1],draw)
p2 <- p2[-1] 
} else {
p2 <- c(p2[-1],draw)  
p1 <- p1[-1]
}
}
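
The comparison rule on its own can be pulled into a small function, which makes the ace-versus-2 wrinkle easy to check in isolation. This is a sketch only: `battle_winner` is a hypothetical helper name, and the return codes (1, 2, or 0 for a tie) are an assumption made for the illustration.

```r
# Hypothetical helper: returns 1 if p1's card wins, 2 if p2's wins, 0 for a tie
battle_winner <- function(a, b) {
  if (a == b) return(0L)              # tie: a war follows
  if (a == 14 && b == 2) return(2L)   # house rule: a 2 trumps an ace
  if (b == 14 && a == 2) return(1L)
  if (a > b) 1L else 2L
}

battle_winner(3, 12)   # p2's Queen beats the 3
battle_winner(14, 2)   # p2's 2 trumps p1's ace
battle_winner(14, 13)  # otherwise the ace is still high
```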

The ‘fun’ starts when the cards match – which in this simulation does not occur until their final cards which are both aces. In this scenario, a variation from the wiki version, the matched cards and 3 more cards from each player form a ‘bounty’. The next card in each pack is then compared and the winner takes all 10 cards. If the cards match again the war scenario is repeated until one player proves victorious. There is also the possibility that one of the players runs out of cards and forfeits the game

# keep running until displayed cards do not match
while (p1[1]==p2[1]) { 
 
# need at least 5 cards to play game
if (length(p1)<5|length(p2)<5) {
break 
}
# displayed card plus next three from each player
booty <- c(booty,p1[1],p1[2],p1[3],p1[4],p2[1],p2[2],p2[3],p2[4])
 
#  remove these cards from the p1,p2 so that new p1[1] is next shown
p1 <- p1[-(1:4)]
p2 <- p2[-(1:4)]
} 
 
draw <- c(p1[1],p2[1])
 
if (p1[1]>p2[1]) {
p1 <- c(p1[-1],booty,draw)
p2 <- p2[-1]
} else {
p2 <- c(p2[-1],booty,draw)
p1 <- p1[-1]
}
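
One thing to note in the block above is that when a player is too short of cards to fight a war, the loop breaks with the top cards still equal and the final `else` hands everything to p2. A variation that makes the forfeit explicit might look like the sketch below. It rests on assumptions of mine rather than the post: `resolve_war` is a hypothetical function, the shorter hand is taken to forfeit (equal lengths go to p1, arbitrarily), and, like the original, it ignores the ace-versus-2 rule for the deciding card.

```r
# p1, p2: non-empty card vectors, top card first, with p1[1] == p2[1] on entry
resolve_war <- function(p1, p2) {
  booty <- c()
  while (p1[1] == p2[1]) {
    if (length(p1) < 5 || length(p2) < 5) {
      # assumed forfeit rule: the shorter hand loses everything
      if (length(p1) < length(p2)) {
        return(list(p1 = integer(0), p2 = c(p2, booty, p1)))
      } else {
        return(list(p1 = c(p1, booty, p2), p2 = integer(0)))
      }
    }
    # matched card plus the next three from each player
    booty <- c(booty, p1[1:4], p2[1:4])
    p1 <- p1[-(1:4)]
    p2 <- p2[-(1:4)]
  }
  draw <- c(p1[1], p2[1])
  if (p1[1] > p2[1]) {
    list(p1 = c(p1[-1], booty, draw), p2 = p2[-1])
  } else {
    list(p1 = p1[-1], p2 = c(p2[-1], booty, draw))
  }
}

# both turn over a 9; after the booty is set aside p1 shows a King against a 2
out <- resolve_war(c(9, 3, 4, 5, 13, 7), c(9, 6, 7, 8, 2, 10))
length(out$p1)   # 11: one card left over, eight booty cards and the two drawn
out$p2           # 10
```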

This scenario is repeated until one player is out of cards and the game is over. According to the wiki article, there is the possibility of an infinite loop being established so action is required to avoid that circumstance. Here is the relevant code

# keep running total of deck size
p1Cards <- c(p1Cards,length(p1))
 
# test for game over
if(length(p1)==52|length(p1)==0){
break
}
# avoid infinite loop
if (length(p1Cards) > 5000) {
break
}
# reset for next iteration
booty <- c()
draw <- c(p1[1],p2[1])

After each game, I wish to record both the summary details and the trend in deck size

# First calculate result and add to df
if (max(p1Cards)<52) {
p1Cards <-   -(p1Cards-52)
game$result = "L"
} else {
game$result = "W"
}
 
games <- rbind(games,game)
deckSize <- data.frame(i,p1Cards)
deckSizes <- rbind(deckSizes,deckSize)

The final stage is to simulate this repeatedly – I settled on 1000 – and save the data for subsequent analysis and a future blog post. The full code is given below.

# Code for replicating War card Game
 
# Andrew Clark April 01 2012
 
library(sets)
 
# make simulation replicable
set.seed(1068)
 
games <- data.frame(id=numeric(), strength=numeric(),aces=numeric(),deuces=numeric(),result=character())
deckSizes <- data.frame(id=numeric(),details=numeric())
# run 1000 simulated games, matching the text and the output file names
for (i in 1:1000) {
# create a regular deck.
# All suits are equivalent so there will be four of each number
deck <- rep(2:14,4)
 
# make results reproducible - no longer required here
# set.seed(1066) 
p1 <- sample(deck, 26, replace=FALSE)
 
# multiset difference: the cards not dealt to p1, keeping duplicates
diffs <- gset_difference(as.gset(deck), as.gset(p1))
# create vector
p2 <- rep(unlist(diffs), times=gset_memberships(diffs))
# this produces the right cards but in order so randomize
p2 <- sample(p2, 26, replace=FALSE)
p1
# [1]  3  3  6 14 10 13  3 10 12  5 11  8  2  5  8  4 10  5  8  4
# [21]  8  7  7  7  9 14
p2
# [1] 12  6  9  5  9  9 13  4  2 14  4 13 13  6  6 11  7 11 12  2
# [21] 10  3  2 11 12 14
p1Cards <- length(p1)
strength <- sum(p1) # the two hands always total 416, so under 208 means p2 has the stronger hand
aces <- sum(p1==14) # 2
deuces <- sum(p1==2) # 1
result <- "N"
 
 
 
game <- data.frame(id=i, strength=strength,aces=aces,deuces=deuces,result=result, stringsAsFactors=FALSE)
 
draw <- c(p1[1],p2[1])
 
booty <- c()
 
repeat { # for each match of cards
if (p1[1]>p2[1]) {      
if (p1[1]==14&p2[1]==2){ #  ace(14) vs a 2
p2 <- c(p2[-1],draw) 
p1 <- p1[-1]
} else {
p1 <- c(p1[-1],draw)
p2 <- p2[-1] 
}
} else if (p2[1]>p1[1]) { 
if (p2[1]==14&p1[1]==2){
p1 <- c(p1[-1],draw)
p2 <- p2[-1] 
} else {
p2 <- c(p2[-1],draw)  
p1 <- p1[-1]
}
} else {
while (p1[1]==p2[1]) { 
 
# need at least 5 cards to play game
if (length(p1)<5|length(p2)<5) {
break 
}
# displayed card plus next three from each player
booty <- c(booty,p1[1],p1[2],p1[3],p1[4],p2[1],p2[2],p2[3],p2[4])
 
#  remove these cards from the p1,p2 so that new p1[1] is next shown
p1 <- p1[-(1:4)]
p2 <- p2[-(1:4)]
} 
 
draw <- c(p1[1],p2[1])
 
if (p1[1]>p2[1]) {
p1 <- c(p1[-1],booty,draw)
p2 <- p2[-1]
} else {
p2 <- c(p2[-1],booty,draw)
p1 <- p1[-1]
}
}
#  battle over
 
# keep running total of deck size
p1Cards <- c(p1Cards,length(p1))
 
# test for game over
if(length(p1)==52|length(p1)==0){
break
}
# avoid infinite loop
if (length(p1Cards) > 5000) {
break
}
# reset for next iteration
booty <- c()
draw <- c(p1[1],p2[1])
 
} 
# war over
 
 
if (max(p1Cards)<52) {
p1Cards <-   -(p1Cards-52)
game$result = "L"
} else {
game$result = "W"
}
 
games <- rbind(games,game)
deckSize <- data.frame(i,p1Cards)
deckSizes <- rbind(deckSizes,deckSize)
 
}
 
# save for later analysis
write.table(games,"games1000random.csv")
write.table(deckSizes,"deckSizes1000random.csv")