Table of contents
 Distribution of played maps
 Distribution of wins per map
 Distribution of winner team's sum of score/rank ("rank stacking")
 Player's "skill" versus time played (Learning curve)
 Effects of capturing/defending flags on win/lose ratio and SPM ("Does PTFOing help?")
 Method of gathering the sample & Things to note
Changelog:
 15th Jan: Initial post.
 21st Jan: Updated "learning curve (4.)" and "does PTFOing help (5.)"
1. Distribution of played maps
Quite simple: the following table shows the distribution of played maps and their fractions.

Map N Frac

Siege of Shanghai: 1127 (0.11) ##############################
Golmud Railway: 957 (0.10) #########################
Operation Locker: 912 (0.09) ########################
Zavod 311: 687 (0.07) ##################
Paracel Storm: 532 (0.05) ##############
Hainan Resort: 467 (0.05) ############
Rogue Transmission: 426 (0.04) ###########
Dawnbreaker: 422 (0.04) ###########
Flood Zone: 367 (0.04) #########
Lancang Dam: 354 (0.04) #########
Operation Metro: 347 (0.03) #########
Whiteout: 319 (0.03) ########
Giants of Karelia: 313 (0.03) ########
Hammerhead: 306 (0.03) ########
Hanger 21: 302 (0.03) ########
Caspian Border: 279 (0.03) #######
Pearl Market: 270 (0.03) #######
Silk Road: 266 (0.03) #######
Propaganda: 200 (0.02) #####
Gulf of Oman: 159 (0.02) ####
Operation Firestorm: 155 (0.02) ####
Guilin Peaks: 147 (0.01) ###
Operation Mortar: 97 (0.01) ##
Wave Breaker: 94 (0.01) ##
Lumphini Garden: 91 (0.01) ##
Dragon Pass: 88 (0.01) ##
Lost Islands: 85 (0.01) ##
Sunken Dragon: 79 (0.01) ##
Altai Range: 77 (0.01) ##
Nansha Strike: 75 (0.01) #

Of course, this is not the fairest comparison, as Final Stand was released about 2–3 months ago while the maximum age of these reports is 4 months. However, we can see
that the original maps seem to be quite popular. The Final Stand maps seem to come next, which can be explained by their freshness.
I can come up with two possible explanations for this distribution:
 There are more non-premium/non-DLC players than premium players. Vanilla map servers are more active because non-premium players are the majority > more players join them > more games played.
 DLC maps are somewhat "throwaway": premium players mostly move to the new DLC maps when they come out.
Of course, these are just assumptions and nothing can be proven by these results.
2. Distribution of wins per map
Next we have the distribution of wins per map. Team nations can be changed in the server settings, but a team's number (1 or 2) comes from where its spawn is on the map.
"95% Conf." gives the win fraction of the team with more wins, with a 95% confidence interval; a smaller interval means a more accurate estimate.
"Unbalance" indicates how much the closer edge of the 95% confidence interval differs from 0.5, if at all. One # = 0.005
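As a sketch of how such an interval can be computed (the post doesn't name its method, so a standard normal-approximation interval is assumed here; it reproduces the table's numbers):

```python
import math

# 95% normal-approximation CI for the stronger team's win fraction.
def win_fraction_ci(wins_1, wins_2, z=1.96):
    n = wins_1 + wins_2
    p = max(wins_1, wins_2) / n            # fraction for the winning side
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, half_width

# Gulf of Oman below: 52 vs. 107 games won.
p, half_width = win_fraction_ci(52, 107)
print(f"{p:.2f}±{half_width:.2f}")  # 0.67±0.07, matching the table
```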

Map Team 1 vs. 2 95% Conf. "Unbalance"

Gulf of Oman 52 vs. 107 (0.67±0.07) ####################
Operation Mortar 65 vs. 32 (0.67±0.09) ###############
Rogue Transmission 274 vs. 152 (0.64±0.05) ###################
Propaganda 76 vs. 124 (0.62±0.07) ##########
Lumphini Garden 35 vs. 56 (0.62±0.10) ###
Nansha Strike 29 vs. 46 (0.61±0.11)
Giants of Karelia 188 vs. 125 (0.60±0.05) #########
Flood Zone 147 vs. 220 (0.60±0.05) #########
Hammerhead 127 vs. 179 (0.58±0.06) #####
Sunken Dragon 46 vs. 33 (0.58±0.11)
Wave Breaker 54 vs. 40 (0.57±0.10)
Caspian Border 159 vs. 120 (0.57±0.06) ##
Silk Road 118 vs. 148 (0.56±0.06)
Operation Metro 154 vs. 193 (0.56±0.05)
Whiteout 175 vs. 144 (0.55±0.05)
Guilin Peaks 67 vs. 80 (0.54±0.08)
Lancang Dam 192 vs. 162 (0.54±0.05)
Hainan Resort 251 vs. 216 (0.54±0.05)
Golmud Railway 449 vs. 508 (0.53±0.03)
Lost Islands 40 vs. 45 (0.53±0.11)
Paracel Storm 279 vs. 253 (0.52±0.04)
Dragon Pass 42 vs. 46 (0.52±0.10)
Zavod 311 329 vs. 358 (0.52±0.04)
Hanger 21 145 vs. 157 (0.52±0.06)
Siege of Shanghai 542 vs. 585 (0.52±0.03)
Operation Locker 442 vs. 470 (0.52±0.03)
Pearl Market 131 vs. 139 (0.51±0.06)
Operation Firestorm 79 vs. 76 (0.51±0.08)
Altai Range 39 vs. 38 (0.51±0.11)
Dawnbreaker 210 vs. 212 (0.50±0.05)

Statistically speaking, the maps are mostly balanced, with some exceptions. The most unbalanced maps are big vehicle maps, which include lots of things to balance, and the sheer size just makes it worse.
For some maps I wouldn't dare to guess whether they are unbalanced or not due to the lack of samples, especially after considering how the sample was taken.
3. Distribution of winner team's sum of score/rank ("rank stacking")
Here we check whether the sum of ranks per team affects the outcome of the game, inspired by my personal hatred towards clan/group stacking.
I calculated the sums of ranks and scores (based on the score needed for each rank) for both teams, then subtracted the loser's sums from the winner's (I simply use the terms "rank" and "score" later on),
giving us a nice and simple sample of "how much more summed rank/score the winner had". Why did I include score? Because gaining ranks is nonlinear, so we can't subtract ranks that simply, but I decided to give it a shot.
The image of the distribution is spoilered due to the bright white background of the plot (it burns my eyes against the forum's background color, at least).
On the left we have the score's distribution (boxplot and histogram); on the right, the rank's distribution.
Both variables have a mean statistically significantly different from 0 (p <<< 0.01, t-test). The 99% confidence interval for rank is 328–351 and for score 46.8–50.8, meaning the mean of the rank/score differences lies inside these intervals with 99% confidence (not for individual reports!).
There's definitely an indication that rank/score differences positively affect the chances of winning, especially since the boxplots show that >75% of the cases are above zero.
In layman's terms: the team with more high-ranked players has a bigger chance of winning.
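The one-sample t-test used here can be sketched as follows; the difference values are made-up toy numbers, not the real sample:

```python
import math
import statistics

# One-sample t-test of winner-minus-loser sums against 0, as used above.
# (Toy data below is illustrative only; the real sample has ~10k reports.)
def t_statistic(sample, mu=0.0):
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)          # sample standard deviation
    return (mean - mu) / (sd / math.sqrt(n))

rank_diffs = [340, -120, 510, 95, 430, -60, 280, 150, 600, 220]
t = t_statistic(rank_diffs)
print(round(t, 2))  # clearly positive: winners tend to have the higher rank sum
```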
4. Player's "skill" versus time played (Learning curve)
3VerstsNorth requested a look at how players' skill improves as a function of time played, so let's get to it.
First, let's define skill: we have a few measurements for this, like SPM, KPM and K/D. They all tell a bit about the player, but not everything (e.g. a player with a high K/D might just be a camper who doesn't help the team).
Let's combine these all into one variable, as pmax suggested to me, by summing the z-values of SPM, KPM and log(KD). Let's look at the plot of this z-product, with observations rounded to the closest 10 hours.
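The z-product metric described above can be sketched like this (the SPM/KPM/KD numbers are made-up examples):

```python
import math
import statistics

# Sum of z-scores of SPM, KPM and log(KD) per player, as described above.
def z_scores(values):
    m, s = statistics.fmean(values), statistics.stdev(values)
    return [(v - m) / s for v in values]

spm = [300, 450, 600, 800]                    # score per minute
kpm = [0.5, 0.8, 1.1, 1.6]                    # kills per minute
log_kd = [math.log(kd) for kd in [0.7, 1.0, 1.3, 2.0]]

z_product = [sum(zs) for zs in zip(z_scores(spm), z_scores(kpm), z_scores(log_kd))]
```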
A beautiful learning curve. Plotting the individual variables SPM, KPM and log(KD) gives similar results. As the plot shows, there is more variance on the right (more time played), and I limited the graph to a max of 1000h due to the small number of observations above that.
There's a clear pattern here: a player improves rapidly in the first 200h, but the improvement gradually slows down. The pattern is so clear that we can try fitting a regression model to put numbers on it:
Formula: z_product = -4.37134 + 0.88833*ln(hours_played)
p-values <<< 0.01 and R² = 0.92
An almost perfect fit, and this is what learning curves quite often seem to look like (ask 3VerstsNorth, he seems to be acquainted with this). Tests confirm there's definitely a connection between the two, and the high R² indicates that time played explains much of the variance of the z-product.
Again, in a nutshell: over roughly the first 100 hours a player's "skill" doubles/triples. However, after that it takes about 300h more to double the "skill" again. Beyond that point the data is too scattered to say anything for sure, but it seems to follow a logarithmic function.
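A fit of this form can be reproduced with ordinary least squares on ln(hours); the data points below are rough illustrative values, not the real observations:

```python
import math

# Least-squares fit of z_product = a + b*ln(hours), the model form used above.
def fit_log_model(hours, z):
    x = [math.log(h) for h in hours]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    b = (sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
         / sum((xi - mx) ** 2 for xi in x))
    a = mz - b * mx
    return a, b

hours = [10, 50, 100, 200, 400, 800]
z = [-2.3, -0.9, -0.3, 0.3, 0.9, 1.5]         # illustrative z-product values
a, b = fit_log_model(hours, z)
# Negative intercept, slope near 0.9: same shape as the fit in the post.
```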
Update 21st Jan
3VN requested an analysis of the first 50h of gameplay time.
Formula: z_product = -3.46490 + 0.72937*ln(hours_played)
p-values <<< 0.01 and R² = 0.93
As with the larger dataset, we get a clear logarithmic relationship in the player's improvement. Judging by this, a player roughly doubles in skill during the first few hours (<10h), after which it takes ~30h to double again from there.
The regression model agrees with the first one, with slightly different coefficients, which can be explained by the highly varying observations.
5. Effects of capturing/defending flags on win/lose ratio and SPM ("Does PTFOing help?")
Does PTFOing really help you gain more score? Well, let's find out!
Let's use flag captures/defends from conquest games as an indicator of "PTFOing" against win/lose ratio and SPM. Quite simply, we then calculate correlations between these different variables.
Correlation in short: the higher the correlation (closer to -1.0 or 1.0), the stronger the indicated dependency/connection between the two variables. Positive indicates a rising line, negative a descending one.
It doesn't tell whether one causes the other!
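Spearman's correlation (the method used for the table below) can be sketched in pure Python: rank both variables, then take the Pearson correlation of the ranks. This sketch skips tie handling and assumes distinct values:

```python
import math

# Rank positions (1-based); no averaging of ties in this sketch.
def rank(values):
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    for pos, idx in enumerate(order):
        ranks[idx] = pos + 1.0
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    return pearson(rank(x), rank(y))

# A perfectly monotone (even if nonlinear) relation gives rho = 1.0,
# which is why Spearman handles scattered values and outliers well.
print(spearman([1, 2, 3, 4, 5], [2, 4, 8, 16, 100]))  # 1.0
```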


logwlr = logarithm of win/lose ratio
flag_def/flag_cap = number of flag defends/captures
flag_rib/cq_rib = number of conquest flag cap ribbons and conquest win ribbons
logwlr SPM flag_cap flag_def flag_rib cq_rib
logwlr 1.0000000 0.5178820 0.3673040 0.4073538 0.3662498 0.3707486
SPM 0.5178820 1.0000000 0.4886269 0.5318530 0.4863264 0.4147895
flag_cap 0.3673040 0.4886269 1.0000000 0.9425577 0.9873665 0.9342006
flag_def 0.4073538 0.5318530 0.9425577 1.0000000 0.9236739 0.9560230
flag_rib 0.3662498 0.4863264 0.9873665 0.9236739 1.0000000 0.9273325
cq_rib 0.3707486 0.4147895 0.9342006 0.9560230 0.9273325 1.0000000

I used Spearman's correlation to compensate for scattered values and heavy outliers. All p-values <<< 0.01, so the values are statistically significant.
As a result we have quite high correlations: logwlr is about ~0.4 for the objective-related variables, which is reasonably high and clearly indicates a connection between the two.
SPM correlates even better with the objective-based variables. Conquest win ribbons correlating very strongly with the flag variables can be explained by the fact that you get both ribbons just by playing games, but it's still pretty high.
So basically: yes, PTFOing improves your chances of winning and gives you more score, now with statistical evidence.
HOWEVER, this doesn't mean PTFOing always makes you win! It just means there's a connection between the two!
Update 21st Jan
People suggested dividing flag captures/defends by games played to get something like flag captures per game. The problem is that the games-played variable also includes rounds of other game modes, but let's give it a shot.


logwlr SPM game_flag_cap game_flag_def
logwlr 1.0000000 0.3763953 0.2153307 0.3694187
SPM 0.3763953 1.0000000 0.3819391 0.5360870
game_flag_cap 0.2153307 0.3819391 1.0000000 0.4695575
game_flag_def 0.3694187 0.5360870 0.4695575 1.0000000

The correlation between flag captures and win/lose ratio dropped by almost 0.2, which is quite a bit, while flag defends only dropped by ~0.04; this could well be explained by the fact that played rounds also include game modes other than conquest.
The correlations with SPM showed the same kind of effect: defending flags didn't change while capturing flags dropped considerably.
This seems to support the conclusion/idea that defending flags is more beneficial for your score and possibly for your team.
6. Method of gathering the sample & Things to note
I am just going to shamelessly copy/paste my earlier text here:
For the past week I have been working on a script for parsing a sample of BF4 players and battlereports for statistical analysis.
pmax did the same a while ago, but he mentions the data might not have been a pure random sample, as the users were taken from the Battlelog forums. Also, that data didn't include reports or some of the variables I would have liked.
So what this script does is first scrape a good bunch of servers and report IDs from
BF4DB, taking a max of ~50 reports per server. For every report ID it uses a URL left by the Battlelog devs to get the report data without needing to parse the HTML (thanks, DICE / whoever made that!), collects the soldier IDs from the report and then uses the
BF4Stats API to get the soldier data.
@Using BF4Stats: I only noticed later that pmax had posted more of those Battlelog API URLs, which have fresher soldier data...
The "rules" I have for selecting reports:
 Only Conquest Large or Conquest
 PC only
 Max starting tickets 1200
 Remove players who had < 1000 combatscore (didn't stay in the game too long)
 Report happened less than 4 months ago
 Only ranked games
 Remove players who had < 5h gameplaytime
The rules are there to avoid unusual outliers in the sample. Unranked games could have any set of rules, servers with high ticket counts can have 300 different players in one game, etc.
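The selection rules above boil down to a pair of filter functions like these (a sketch only: the field names are hypothetical, and the real Battlelog report JSON uses different keys):

```python
# Hypothetical field names; the real report JSON differs.
def keep_report(report):
    return (report["mode"] in ("Conquest", "ConquestLarge")
            and report["platform"] == "PC"
            and report["start_tickets"] <= 1200
            and report["ranked"]
            and report["age_days"] <= 120)      # less than ~4 months old

def keep_player(player):
    # Drop players who left early or are brand new to the game.
    return player["combat_score"] >= 1000 and player["hours_played"] >= 5

example = {"mode": "Conquest", "platform": "PC", "start_tickets": 800,
           "ranked": True, "age_days": 30}
print(keep_report(example))  # True
```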
I had 10k battlereports and ~40k players for analysis. I tried to keep the sampling as random as possible, but I am unsure how bf4stats works with their server listing, so there might be bias in some direction; however, I believe these give us at least a general picture of the game's underlying statistics.
For anybody who is interested in having this data, just contact me. Currently it's in ugly Python pickle files, which are a bit clumsy, so I need to refine it a bit.
Also, if you have more questions regarding BF4 statistics, just go ahead and ask, and I'll see what I can do.
And one more thing: thank you for being such an awesome community! It is nice to share something like this with you guys.
Especially big thanks to 3VerstsNorth for being an awesome doctor and researcher in general. You keep inspiring me! pmax also inspired me with his own research and with all the help and tips he gave me!