Cluster Analysis Using Hierarchical and K-Means Clustering
Applying two commonly used clustering methods to find subgroups within the FIFA soccer data—using R, Python, and Julia.
Cluster analysis is a data exploration technique used to identify groups, or clusters, within a dataset based on similarities between observations. It has various use cases, including customer segmentation, image recognition, anomaly detection, and pattern recognition. Two common techniques are hierarchical clustering and K-means clustering. Hierarchical clustering builds a hierarchy of clusters by agglomerating (merging) or dividing clusters based on similarity. K-means clustering, on the other hand, partitions data points into k distinct clusters based on their proximity to cluster centroids.
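To make the agglomerative idea concrete, here is a minimal sketch in plain Python. It is not the implementation used later in this article: the function name `agglomerate`, the toy points, and the choice of single linkage (distance between the closest pair of members) are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Hypothetical helper for illustration. Cluster-to-cluster distance is
    single linkage: the smallest distance between any two members.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    merges = []  # records the hierarchy as (cluster_a, cluster_b, distance)
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((list(clusters[a]), list(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters, merges


# Toy 2-D points (made up): two tight pairs and one isolated point.
toy_points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.3, 5.0), (9.0, 1.0)]
toy_clusters, toy_merges = agglomerate(toy_points, n_clusters=3)
print(toy_clusters)
for left, right, distance in toy_merges:
    print(left, "+", right, "at distance", round(distance, 3))
```

The list of merges is the hierarchy: stopping earlier or later in that list yields more or fewer clusters, which is why the number of clusters does not have to be fixed up front.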
The advantages of hierarchical clustering are its flexibility in determining the number of clusters, its ability to capture nested structures, and its suitability for small to medium-sized datasets. However, it can be computationally intensive and scales poorly to large datasets. K-means clustering is computationally efficient and scales to large datasets, but it requires specifying the number of clusters in advance and is sensitive to the initial choice of centroids.
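The sensitivity to initialization is easy to see on a toy example. The sketch below is a bare-bones version of the standard assign-then-update loop (Lloyd's algorithm) in plain Python. The helper names `kmeans` and `inertia`, the four toy points, and both sets of starting centroids are assumptions for illustration only.

```python
from math import dist


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]


def kmeans(points, initial_centroids, n_iter=10):
    """Hypothetical minimal k-means: k is fixed by the number of start centroids."""
    centroids = list(initial_centroids)
    for _ in range(n_iter):
        labels = assign(points, centroids)          # assignment step
        for c in range(len(centroids)):             # update step
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign(points, centroids), centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances (lower is tighter)."""
    return sum(dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))


# Four corners of a wide rectangle (made-up data).
toy_points = [(0, 0), (0, 1), (4, 0), (4, 1)]

labels_a, centroids_a = kmeans(toy_points, [(0, 0.5), (4, 0.5)])  # left/right start
labels_b, centroids_b = kmeans(toy_points, [(2, 0), (2, 1)])      # bottom/top start

print(labels_a, inertia(toy_points, labels_a, centroids_a))
print(labels_b, inertia(toy_points, labels_b, centroids_b))
```

Both runs converge, but the second one settles on a bottom/top split that is much looser than the left/right split. This is why practical implementations typically restart from several initializations and keep the best result.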
Let’s look at how these techniques uncover underlying structures and group similar observations within the FIFA soccer dataset.
Getting Started
If you are interested in reproducing this work, here are the versions of R, Python, and Julia used, along with the packages for each. Additionally, Leland Wilkinson’s approach to data visualization (Grammar of Graphics) has been adopted for this work. Finally, my coding style here is deliberately verbose, so that it is easy to trace where functions, methods, and variables originate, and to make this a learning experience for everyone, including me.
VERSION
v"1.9.2"
import Pkg
Pkg.add(name="CSV", version="0.10.4")
Pkg.add(name="DataFrames", version="1.3.6")
Pkg.add(name="CategoricalArrays", version="0.10.7")
Pkg.add(name="Colors", version="0.12.10")
Pkg.add(name="Cairo", version="1.0.5")
Pkg.add(name="Gadfly", version="1.3.4")
Pkg.add(name="Clustering", version="0.15.3")
using CSV
using DataFrames
using CategoricalArrays
using Colors
using Cairo
using Gadfly
using Clustering
import sys
print(sys.version)
3.11.4 (v3.11.4:d2340ef257, Jun 6 2023, 19:15:51) [Clang 13.0.0 (clang-1300.0.29.30)]
!pip install pandas==2.0.0
!pip install plotnine==0.10.1
!pip install scipy==1.7.3
import random
import pandas
import plotnine
import scipy
R.version.string
[1] "R version 4.2.3 (2023-03-15)"
require(devtools)
devtools::install_version("dplyr", version="1.1.1", repos="http://cran.us.r-project.org")
devtools::install_version("ggplot2", version="3.4.2", repos="http://cran.us.r-project.org")
devtools::install_version("mclust", version="6.0.0", repos="http://cran.us.r-project.org")
library(dplyr)
library(ggplot2)
library(mclust)
Importing and Examining Dataset
Upon importing the dataset from Kaggle and examining it, we can see that the data frame has 17,954 rows and 106 columns. A little more than half of the columns are numerical, and the rest are categorical. sofifa_id, player_url, short_name, and long_name are identifiers with almost 100% variability; those will be removed when we get to data wrangling.
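One way to spot identifier-like columns is to compare the number of distinct values in a column with the number of rows. The sketch below does this in plain Python on four rows copied from the dataset preview. The helper name `uniqueness_ratio` and the 0.9 cut-off are assumptions for illustration, not part of the original workflow.

```python
def uniqueness_ratio(rows, column):
    """Share of distinct values among the non-missing values of a column."""
    values = [row[column] for row in rows if row[column] is not None]
    return len(set(values)) / len(values) if values else 0.0


# First four players from the preview, restricted to a few columns.
toy_rows = [
    {"sofifa_id": 20801, "short_name": "Cristiano Ronaldo", "preferred_foot": "Right", "overall": 94},
    {"sofifa_id": 158023, "short_name": "L. Messi", "preferred_foot": "Left", "overall": 93},
    {"sofifa_id": 190871, "short_name": "Neymar", "preferred_foot": "Right", "overall": 92},
    {"sofifa_id": 167495, "short_name": "M. Neuer", "preferred_foot": "Right", "overall": 92},
]

ratios = {column: uniqueness_ratio(toy_rows, column) for column in toy_rows[0]}

# Assumed cut-off: treat a column as identifier-like if at least 90% of its values are distinct.
identifier_like = [column for column, ratio in ratios.items() if ratio >= 0.9]

print(ratios)
print(identifier_like)
```

On the full data frame, a similar check could likely be written with pandas as `fifa_py.nunique() / len(fifa_py)`. A column where nearly every row has its own value carries no grouping information, so it is a natural candidate for removal before clustering.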
fifa_jl = CSV.File("../../dataset/fifa/fifa-2018.csv") |> DataFrames.DataFrame
17954×106 DataFrame
Row │ sofifa_id player_url short_name long_name age dob height_cm weight_kg nationality club_name league_name league_rank overall potential value_eur wage_eur player_positions preferred_foot international_reputation weak_foot skill_moves work_rate body_type real_face release_clause_eur player_tags team_position team_jersey_number loaned_from joined contract_valid_until nation_position nation_jersey_number pace shooting passing dribbling defending physic gk_diving gk_handling gk_kicking gk_reflexes gk_speed gk_positioning player_traits attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes ls st rs lw lf cf rf rw lam cam ram lm lcm cm rcm rm lwb ldm cdm rdm rwb lb lcb cb rcb rb
│ Int64 String String31 String Int64 Date Int64 Int64 String31 String? String31? Int64? Int64 Int64 Int64 Int64 String31 String7 Int64 Int64 Int64 String15 String15 String3 Int64? String? String3? Int64? String31? Date Int64? String3? Int64? Int64? Int64? Int64? Int64? Int64? Int64? Int64? Int64? Int64? Int64? Int64? Int64? String? Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7 String7
───────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 20801 https://sofifa.com/player/20801/… Cristiano Ronaldo Cristiano Ronaldo dos Santos Ave… 32 1985-02-05 185 80 Portugal Real Madrid Spain Primera Division 1 94 94 95500000 575000 LW, ST Right 5 4 5 High/Low C. Ronaldo Yes 195800000 #Speedster, #Dribbler, #Distance… LW 7 missing 2009-07-01 2021 LS 7 90 93 82 90 33 80 missing missing missing missing missing missing Power Free-Kick, Flair, Long Sho… 85 94 88 83 88 91 81 76 77 93 89 91 89 96 63 94 95 92 80 92 63 29 95 85 85 95 22 31 23 7 11 15 14 11 92+2 92+2 92+2 91+3 91+3 91+3 91+3 91+3 89+4 89+4 89+4 89+4 82+4 82+4 82+4 89+4 66+4 62+4 62+4 62+4 66+4 61+4 53+4 53+4 53+4 61+4
2 │ 158023 https://sofifa.com/player/158023… L. Messi Lionel Andrés Messi Cuccittini 30 1987-06-24 170 72 Argentina FC Barcelona Spain Primera Division 1 93 93 105000000 575000 RW Left 5 4 4 Medium/Medium Messi Yes 215300000 #Dribbler, #FK Specialist, #Acro… RW 10 missing 2004-07-01 2018 RW 10 89 90 86 96 26 61 missing missing missing missing missing missing Finesse Shot, Long Shot Taker (A… 77 95 71 88 85 97 89 90 87 95 92 87 90 95 95 85 68 73 59 88 48 22 93 90 74 96 13 28 26 6 11 15 14 8 88+4 88+4 88+4 91+2 92+1 92+1 92+1 91+2 92+1 92+1 92+1 90+3 84+4 84+4 84+4 90+3 62+4 59+4 59+4 59+4 62+4 57+4 45+4 45+4 45+4 57+4
3 │ 190871 https://sofifa.com/player/190871… Neymar Neymar da Silva Santos Júnior 25 1992-02-05 175 68 Brazil Paris Saint-Germain French Ligue 1 1 92 94 123000000 275000 LW Right 5 5 5 High/Medium Neymar Yes 236800000 #Speedster, #Dribbler, #Acrobat LW 10 missing 2017-08-03 2022 LW 10 92 84 79 95 30 60 missing missing missing missing missing missing Diver, Flair, Speed Dribbler (AI… 75 89 62 81 83 96 81 84 75 95 94 90 96 88 82 80 61 78 53 77 56 36 90 80 81 92 21 24 33 9 9 15 15 11 84+4 84+4 84+4 89+3 88+3 88+3 88+3 89+3 88+4 88+4 88+4 87+4 79+4 79+4 79+4 87+4 64+4 59+4 59+4 59+4 64+4 59+4 46+4 46+4 46+4 59+4
4 │ 167495 https://sofifa.com/player/167495… M. Neuer Manuel Neuer 31 1986-03-27 193 92 Germany FC Bayern München German 1. Bundesliga 1 92 92 61000000 225000 GK Right 5 4 1 Medium/Medium Normal Yes 100700000 missing GK 1 missing 2011-07-01 2021 GK 1 missing missing missing missing missing missing 91 90 95 89 60 91 GK Long Throw, 1-on-1 Rush, Rush… 15 13 25 55 11 30 14 11 59 48 58 61 52 85 35 25 78 44 83 16 29 30 12 70 47 70 10 10 11 91 90 95 91 89 36+4 36+4 36+4 40+3 41+3 41+3 41+3 40+3 47+4 47+4 47+4 44+4 48+4 48+4 48+4 44+4 36+4 41+4 41+4 41+4 36+4 34+4 33+4 33+4 33+4 34+4
5 │ 176580 https://sofifa.com/player/176580… L. Suárez Luis Alberto Suárez Díaz 30 1987-01-24 182 86 Uruguay FC Barcelona Spain Primera Division 1 92 92 97000000 500000 ST Right 5 4 4 High/Medium Normal Yes 198900000 #Acrobat, #Clinical Finisher ST 9 missing 2014-07-11 2021 LS 9 82 90 79 87 42 81 missing missing missing missing missing missing Diver, Beat Offside Trap, Techni… 77 94 77 83 88 86 86 84 64 91 88 77 86 93 60 87 69 89 80 86 78 41 92 84 85 83 30 45 38 27 25 31 33 37 88+4 88+4 88+4 87+3 88+3 88+3 88+3 87+3 87+4 87+4 87+4 85+4 80+4 80+4 80+4 85+4 68+4 65+4 65+4 65+4 68+4 64+4 58+4 58+4 58+4 64+4
6 │ 188545 https://sofifa.com/player/188545… R. Lewandowski Robert Lewandowski 28 1988-08-21 185 79 Poland FC Bayern München German 1. Bundesliga 1 91 91 92000000 350000 ST Right 4 4 3 High/Medium Normal Yes 151800000 #Clinical Finisher ST 9 missing 2014-07-01 2021 ST 9 81 88 75 86 38 82 missing missing missing missing missing missing Injury Free, Finesse Shot, Chip … 62 91 85 83 87 85 77 84 65 89 79 83 78 91 80 88 84 79 84 83 80 39 91 78 81 87 25 42 19 15 6 12 8 10 88+3 88+3 88+3 84+2 87+2 87+2 87+2 84+2 84+3 84+3 84+3 82+3 78+3 78+3 78+3 82+3 61+3 62+3 62+3 62+3 61+3 58+3 57+3 57+3 57+3 58+3
7 │ 193080 https://sofifa.com/player/193080… De Gea David De Gea Quintana 26 1990-11-07 193 76 Spain Manchester United English Premier League 1 90 92 64500000 200000 GK Right 4 3 1 Medium/Medium Lean Yes 124200000 missing GK 1 missing 2011-07-01 2019 GK 1 missing missing missing missing missing missing 90 85 87 90 58 86 GK Long Throw, Saves with Feet 17 13 21 50 13 18 21 19 51 42 57 58 60 88 43 31 67 40 64 12 38 30 12 68 40 64 13 21 13 90 85 87 86 90 33+2 33+2 33+2 37+1 38+1 38+1 38+1 37+1 43+2 43+2 43+2 40+2 45+2 45+2 45+2 40+2 36+2 41+2 41+2 41+2 36+2 35+2 34+2 34+2 34+2 35+2
8 │ 183277 https://sofifa.com/player/183277… E. Hazard Eden Hazard 26 1991-01-07 173 76 Belgium Chelsea English Premier League 1 90 91 90500000 300000 LW Right 4 4 4 High/Medium Normal Yes 174200000 #Speedster, #Dribbler, #Acrobat LW 10 missing 2012-07-01 2020 LF 10 90 82 84 92 32 66 missing missing missing missing missing missing Beat Offside Trap, Finesse Shot,… 80 83 57 86 79 93 82 79 81 92 93 87 93 85 91 79 59 79 65 82 54 41 85 86 86 87 25 27 22 11 12 6 8 8 82+3 82+3 82+3 88+2 87+2 87+2 87+2 88+2 88+3 88+3 88+3 87+3 81+3 81+3 81+3 87+3 64+3 61+3 61+3 61+3 64+3 59+3 47+3 47+3 47+3 59+3
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
17948 │ 238308 https://sofifa.com/player/238308… L. Sackey Leslie Sackey 18 1998-11-29 182 72 Ghana Scunthorpe United English League One 3 46 64 50000 1000 CB, ST Left 1 3 2 Medium/Medium Normal No 119000 missing RES 35 missing 2017-05-02 2018 missing missing 49 20 24 30 41 61 missing missing missing missing missing missing missing 19 20 48 31 19 23 17 17 24 32 48 49 49 40 47 21 60 55 67 17 52 38 20 22 21 33 38 44 43 15 8 10 10 7 31+1 31+1 31+1 29+0 29+0 29+0 29+0 29+0 29+1 29+1 29+1 30+1 30+1 30+1 30+1 30+1 38+1 38+1 38+1 38+1 38+1 40+1 45+1 45+1 45+1 40+1
17949 │ 238813 https://sofifa.com/player/238813… J. Lundstram Josh Lundstram 18 1999-02-19 176 61 England Crewe Alexandra English League Two 4 46 64 60000 2000 CM Right 1 2 2 Medium/Medium Lean No 143000 missing RES 22 missing 2017-05-03 2018 missing missing 58 35 44 45 45 47 missing missing missing missing missing missing missing 34 32 40 49 25 41 30 34 44 43 57 58 58 49 74 43 56 49 46 32 46 46 37 51 43 45 43 48 47 10 13 7 8 9 41+1 41+1 41+1 44+0 43+0 43+0 43+0 44+0 45+1 45+1 45+1 45+1 45+1 45+1 45+1 45+1 46+1 47+1 47+1 47+1 46+1 47+1 46+1 46+1 46+1 47+1
17950 │ 237463 https://sofifa.com/player/237463… A. Kelsey Adam Kelsey 17 1999-11-12 188 74 England Scunthorpe United English League One 3 46 63 50000 500 GK Right 1 2 1 Medium/Medium Lean No 119000 missing RES 41 missing 2017-01-26 2019 missing missing missing missing missing missing missing missing 46 47 49 48 28 42 missing 14 5 10 19 6 12 13 12 21 12 24 32 38 40 26 19 31 28 50 7 16 9 6 26 17 23 9 11 10 46 47 49 42 48 16+1 16+1 16+1 17+0 16+0 16+0 16+0 17+0 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 17+1 17+1 17+1 17+1 17+1
17951 │ 231381 https://sofifa.com/player/231381… J. Young Jordan Young 17 1999-07-31 175 71 Scotland Swindon Town English League Two 4 46 61 60000 2000 ST Left 1 2 2 Medium/Medium Lean No 143000 missing SUB 26 missing 2015-10-17 2019 missing missing 58 47 35 43 20 33 missing missing missing missing missing missing missing 28 47 47 42 33 37 32 25 30 41 66 51 60 54 77 42 73 33 32 51 26 16 46 37 58 50 18 17 14 11 15 12 12 11 45+1 45+1 45+1 44+0 45+0 45+0 45+0 44+0 44+1 44+1 44+1 42+1 38+1 38+1 38+1 42+1 32+1 29+1 29+1 29+1 32+1 31+1 28+1 28+1 28+1 31+1
17952 │ 240404 https://sofifa.com/player/240404… J. Keeble Jack Keeble 18 1999-03-22 172 66 England Grimsby Town English League Two 4 46 56 40000 1000 CB Right 1 2 2 Low/Medium Lean No 78000 missing SUB 24 missing 2017-06-30 2018 missing missing 63 20 29 34 46 45 missing missing missing missing missing missing missing 28 15 43 30 24 29 28 27 27 34 66 60 45 48 48 30 54 52 42 16 40 48 27 28 25 37 40 52 49 5 10 12 12 11 33+1 33+1 33+1 34+0 33+0 33+0 33+0 34+0 32+1 32+1 32+1 35+1 34+1 34+1 34+1 35+1 44+1 41+1 41+1 41+1 44+1 46+1 45+1 45+1 45+1 46+1
17953 │ 11728 https://sofifa.com/player/11728/… B. Richardson Barry Richardson 47 1969-08-05 185 77 England Wycombe Wanderers English League Two 4 46 46 2000 1000 GK Right 1 2 1 Medium/Medium Stocky No missing missing SUB 13 missing 2014-01-30 2022 missing missing missing missing missing missing missing missing 39 50 39 37 25 50 missing 11 11 12 12 12 11 12 11 13 22 25 25 35 51 44 13 51 32 47 16 44 16 13 17 22 44 14 12 13 39 50 39 50 37 19+1 19+1 19+1 19+0 19+0 19+0 19+0 19+0 19+1 19+1 19+1 19+1 19+1 19+1 19+1 19+1 20+1 21+1 21+1 21+1 20+1 20+1 22+1 22+1 22+1 20+1
17954 │ 235352 https://sofifa.com/player/235352… T. Käßemodel Tommy Käßemodel 28 1988-08-09 173 75 Germany FC Erzgebirge Aue German 2. Bundesliga 2 46 46 30000 2000 CM Right 1 3 2 Medium/Medium Stocky No 47000 missing RES 29 missing 2016-07-01 2018 missing missing 23 42 48 45 36 38 missing missing missing missing missing missing missing 42 40 38 54 34 44 52 37 51 46 25 22 40 47 52 52 28 30 37 39 52 31 39 43 41 42 37 36 38 10 12 6 13 6 41+1 41+1 41+1 41+0 42+0 42+0 42+0 41+0 44+1 44+1 44+1 42+1 45+1 45+1 45+1 42+1 38+1 42+1 42+1 42+1 38+1 37+1 38+1 38+1 38+1 37+1
17939 rows omitted
fifa_py = pandas.read_csv("../../dataset/fifa/fifa-2018.csv")
fifa_py.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17954 entries, 0 to 17953
Columns: 106 entries, sofifa_id to rb
dtypes: float64(17), int64(45), object(44)
memory usage: 14.5+ MB
fifa_py.head(n=8)
sofifa_id player_url short_name long_name age dob height_cm weight_kg nationality club_name league_name league_rank overall potential value_eur wage_eur player_positions preferred_foot international_reputation weak_foot skill_moves work_rate body_type real_face release_clause_eur player_tags team_position team_jersey_number loaned_from joined contract_valid_until nation_position nation_jersey_number pace shooting passing dribbling defending physic gk_diving gk_handling gk_kicking gk_reflexes gk_speed gk_positioning player_traits attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes ls st rs lw lf cf rf rw lam cam ram lm lcm cm rcm rm lwb ldm cdm rdm rwb lb lcb cb rcb rb
0 20801 https://sofifa.com/player/20801/c-ronaldo-dos-santos-aveiro/180002 Cristiano Ronaldo Cristiano Ronaldo dos Santos Aveiro 32 1985-02-05 185 80 Portugal Real Madrid Spain Primera Division 1.0 94 94 95500000 575000 LW, ST Right 5 4 5 High/Low C. Ronaldo Yes 195800000.0 #Speedster, #Dribbler, #Distance Shooter, #Acrobat, #Clinical Finisher, #Complete Forward LW 7.0 NaN 2009-07-01 2021.0 LS 7.0 90.0 93.0 82.0 90.0 33.0 80.0 NaN NaN NaN NaN NaN NaN Power Free-Kick, Flair, Long Shot Taker (AI), Speed Dribbler (AI), Through Ball 85 94 88 83 88 91 81 76 77 93 89 91 89 96 63 94 95 92 80 92 63 29 95 85 85 95 22 31 23 7 11 15 14 11 92+2 92+2 92+2 91+3 91+3 91+3 91+3 91+3 89+4 89+4 89+4 89+4 82+4 82+4 82+4 89+4 66+4 62+4 62+4 62+4 66+4 61+4 53+4 53+4 53+4 61+4
1 158023 https://sofifa.com/player/158023/lionel-messi/180002 L. Messi Lionel Andrés Messi Cuccittini 30 1987-06-24 170 72 Argentina FC Barcelona Spain Primera Division 1.0 93 93 105000000 575000 RW Left 5 4 4 Medium/Medium Messi Yes 215300000.0 #Dribbler, #FK Specialist, #Acrobat, #Clinical Finisher RW 10.0 NaN 2004-07-01 2018.0 RW 10.0 89.0 90.0 86.0 96.0 26.0 61.0 NaN NaN NaN NaN NaN NaN Finesse Shot, Long Shot Taker (AI), Speed Dribbler (AI), Playmaker (AI), One Club Player, Chip Shot (AI) 77 95 71 88 85 97 89 90 87 95 92 87 90 95 95 85 68 73 59 88 48 22 93 90 74 96 13 28 26 6 11 15 14 8 88+4 88+4 88+4 91+2 92+1 92+1 92+1 91+2 92+1 92+1 92+1 90+3 84+4 84+4 84+4 90+3 62+4 59+4 59+4 59+4 62+4 57+4 45+4 45+4 45+4 57+4
2 190871 https://sofifa.com/player/190871/neymar-da-silva-santos-jr/180002 Neymar Neymar da Silva Santos Júnior 25 1992-02-05 175 68 Brazil Paris Saint-Germain French Ligue 1 1.0 92 94 123000000 275000 LW Right 5 5 5 High/Medium Neymar Yes 236800000.0 #Speedster, #Dribbler, #Acrobat LW 10.0 NaN 2017-08-03 2022.0 LW 10.0 92.0 84.0 79.0 95.0 30.0 60.0 NaN NaN NaN NaN NaN NaN Diver, Flair, Speed Dribbler (AI), Technical Dribbler (AI), Takes Finesse Free Kicks 75 89 62 81 83 96 81 84 75 95 94 90 96 88 82 80 61 78 53 77 56 36 90 80 81 92 21 24 33 9 9 15 15 11 84+4 84+4 84+4 89+3 88+3 88+3 88+3 89+3 88+4 88+4 88+4 87+4 79+4 79+4 79+4 87+4 64+4 59+4 59+4 59+4 64+4 59+4 46+4 46+4 46+4 59+4
3 167495 https://sofifa.com/player/167495/manuel-neuer/180002 M. Neuer Manuel Neuer 31 1986-03-27 193 92 Germany FC Bayern München German 1. Bundesliga 1.0 92 92 61000000 225000 GK Right 5 4 1 Medium/Medium Normal Yes 100700000.0 NaN GK 1.0 NaN 2011-07-01 2021.0 GK 1.0 NaN NaN NaN NaN NaN NaN 91.0 90.0 95.0 89.0 60.0 91.0 GK Long Throw, 1-on-1 Rush, Rushes Out Of Goal, Comes For Crosses 15 13 25 55 11 30 14 11 59 48 58 61 52 85 35 25 78 44 83 16 29 30 12 70 47 70 10 10 11 91 90 95 91 89 36+4 36+4 36+4 40+3 41+3 41+3 41+3 40+3 47+4 47+4 47+4 44+4 48+4 48+4 48+4 44+4 36+4 41+4 41+4 41+4 36+4 34+4 33+4 33+4 33+4 34+4
4 176580 https://sofifa.com/player/176580/luis-suarez/180002 L. Suárez Luis Alberto Suárez Díaz 30 1987-01-24 182 86 Uruguay FC Barcelona Spain Primera Division 1.0 92 92 97000000 500000 ST Right 5 4 4 High/Medium Normal Yes 198900000.0 #Acrobat, #Clinical Finisher ST 9.0 NaN 2014-07-11 2021.0 LS 9.0 82.0 90.0 79.0 87.0 42.0 81.0 NaN NaN NaN NaN NaN NaN Diver, Beat Offside Trap, Technical Dribbler (AI) 77 94 77 83 88 86 86 84 64 91 88 77 86 93 60 87 69 89 80 86 78 41 92 84 85 83 30 45 38 27 25 31 33 37 88+4 88+4 88+4 87+3 88+3 88+3 88+3 87+3 87+4 87+4 87+4 85+4 80+4 80+4 80+4 85+4 68+4 65+4 65+4 65+4 68+4 64+4 58+4 58+4 58+4 64+4
5 188545 https://sofifa.com/player/188545/robert-lewandowski/180002 R. Lewandowski Robert Lewandowski 28 1988-08-21 185 79 Poland FC Bayern München German 1. Bundesliga 1.0 91 91 92000000 350000 ST Right 4 4 3 High/Medium Normal Yes 151800000.0 #Clinical Finisher ST 9.0 NaN 2014-07-01 2021.0 ST 9.0 81.0 88.0 75.0 86.0 38.0 82.0 NaN NaN NaN NaN NaN NaN Injury Free, Finesse Shot, Chip Shot (AI) 62 91 85 83 87 85 77 84 65 89 79 83 78 91 80 88 84 79 84 83 80 39 91 78 81 87 25 42 19 15 6 12 8 10 88+3 88+3 88+3 84+2 87+2 87+2 87+2 84+2 84+3 84+3 84+3 82+3 78+3 78+3 78+3 82+3 61+3 62+3 62+3 62+3 61+3 58+3 57+3 57+3 57+3 58+3
6 193080 https://sofifa.com/player/193080/david-de-gea-quintana/180002 De Gea David De Gea Quintana 26 1990-11-07 193 76 Spain Manchester United English Premier League 1.0 90 92 64500000 200000 GK Right 4 3 1 Medium/Medium Lean Yes 124200000.0 NaN GK 1.0 NaN 2011-07-01 2019.0 GK 1.0 NaN NaN NaN NaN NaN NaN 90.0 85.0 87.0 90.0 58.0 86.0 GK Long Throw, Saves with Feet 17 13 21 50 13 18 21 19 51 42 57 58 60 88 43 31 67 40 64 12 38 30 12 68 40 64 13 21 13 90 85 87 86 90 33+2 33+2 33+2 37+1 38+1 38+1 38+1 37+1 43+2 43+2 43+2 40+2 45+2 45+2 45+2 40+2 36+2 41+2 41+2 41+2 36+2 35+2 34+2 34+2 34+2 35+2
7 183277 https://sofifa.com/player/183277/eden-hazard/180002 E. Hazard Eden Hazard 26 1991-01-07 173 76 Belgium Chelsea English Premier League 1.0 90 91 90500000 300000 LW Right 4 4 4 High/Medium Normal Yes 174200000.0 #Speedster, #Dribbler, #Acrobat LW 10.0 NaN 2012-07-01 2020.0 LF 10.0 90.0 82.0 84.0 92.0 32.0 66.0 NaN NaN NaN NaN NaN NaN Beat Offside Trap, Finesse Shot, Flair, Playmaker (AI), Technical Dribbler (AI) 80 83 57 86 79 93 82 79 81 92 93 87 93 85 91 79 59 79 65 82 54 41 85 86 86 87 25 27 22 11 12 6 8 8 82+3 82+3 82+3 88+2 87+2 87+2 87+2 88+2 88+3 88+3 88+3 87+3 81+3 81+3 81+3 87+3 64+3 61+3 61+3 61+3 64+3 59+3 47+3 47+3 47+3 59+3
fifa_py.tail(n=8)
sofifa_id player_url short_name long_name age dob height_cm weight_kg nationality club_name league_name league_rank overall potential value_eur wage_eur player_positions preferred_foot international_reputation weak_foot skill_moves work_rate body_type real_face release_clause_eur player_tags team_position team_jersey_number loaned_from joined contract_valid_until nation_position nation_jersey_number pace shooting passing dribbling defending physic gk_diving gk_handling gk_kicking gk_reflexes gk_speed gk_positioning player_traits attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes ls st rs lw lf cf rf rw lam cam ram lm lcm cm rcm rm lwb ldm cdm rdm rwb lb lcb cb rcb rb
17946 234733 https://sofifa.com/player/234733/max-wright/180002 M. Wright Max Wright 19 1998-04-06 170 76 England Grimsby Town English League Two 4.0 46 64 60000 2000 RW, RB, LB Right 1 3 3 Medium/Medium Lean No 143000.0 NaN RES 31.0 NaN 2016-07-01 2018.0 NaN NaN 70.0 33.0 39.0 48.0 34.0 47.0 NaN NaN NaN NaN NaN NaN NaN 42 36 38 40 25 47 32 23 38 40 71 70 69 31 71 25 48 50 53 34 28 32 46 43 20 45 39 30 30 5 8 14 14 9 41+1 41+1 41+1 46+0 43+0 43+0 43+0 46+0 44+1 44+1 44+1 45+1 40+1 40+1 40+1 45+1 41+1 37+1 37+1 37+1 41+1 40+1 37+1 37+1 37+1 40+1
17947 238308 https://sofifa.com/player/238308/leslie-sackey/180002 L. Sackey Leslie Sackey 18 1998-11-29 182 72 Ghana Scunthorpe United English League One 3.0 46 64 50000 1000 CB, ST Left 1 3 2 Medium/Medium Normal No 119000.0 NaN RES 35.0 NaN 2017-05-02 2018.0 NaN NaN 49.0 20.0 24.0 30.0 41.0 61.0 NaN NaN NaN NaN NaN NaN NaN 19 20 48 31 19 23 17 17 24 32 48 49 49 40 47 21 60 55 67 17 52 38 20 22 21 33 38 44 43 15 8 10 10 7 31+1 31+1 31+1 29+0 29+0 29+0 29+0 29+0 29+1 29+1 29+1 30+1 30+1 30+1 30+1 30+1 38+1 38+1 38+1 38+1 38+1 40+1 45+1 45+1 45+1 40+1
17948 238813 https://sofifa.com/player/238813/josh-lundstram/180002 J. Lundstram Josh Lundstram 18 1999-02-19 176 61 England Crewe Alexandra English League Two 4.0 46 64 60000 2000 CM Right 1 2 2 Medium/Medium Lean No 143000.0 NaN RES 22.0 NaN 2017-05-03 2018.0 NaN NaN 58.0 35.0 44.0 45.0 45.0 47.0 NaN NaN NaN NaN NaN NaN NaN 34 32 40 49 25 41 30 34 44 43 57 58 58 49 74 43 56 49 46 32 46 46 37 51 43 45 43 48 47 10 13 7 8 9 41+1 41+1 41+1 44+0 43+0 43+0 43+0 44+0 45+1 45+1 45+1 45+1 45+1 45+1 45+1 45+1 46+1 47+1 47+1 47+1 46+1 47+1 46+1 46+1 46+1 47+1
17949 237463 https://sofifa.com/player/237463/adam-kelsey/180002 A. Kelsey Adam Kelsey 17 1999-11-12 188 74 England Scunthorpe United English League One 3.0 46 63 50000 500 GK Right 1 2 1 Medium/Medium Lean No 119000.0 NaN RES 41.0 NaN 2017-01-26 2019.0 NaN NaN NaN NaN NaN NaN NaN NaN 46.0 47.0 49.0 48.0 28.0 42.0 NaN 14 5 10 19 6 12 13 12 21 12 24 32 38 40 26 19 31 28 50 7 16 9 6 26 17 23 9 11 10 46 47 49 42 48 16+1 16+1 16+1 17+0 16+0 16+0 16+0 17+0 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 17+1 17+1 17+1 17+1 17+1
17950 231381 https://sofifa.com/player/231381/jordan-young/180002 J. Young Jordan Young 17 1999-07-31 175 71 Scotland Swindon Town English League Two 4.0 46 61 60000 2000 ST Left 1 2 2 Medium/Medium Lean No 143000.0 NaN SUB 26.0 NaN 2015-10-17 2019.0 NaN NaN 58.0 47.0 35.0 43.0 20.0 33.0 NaN NaN NaN NaN NaN NaN NaN 28 47 47 42 33 37 32 25 30 41 66 51 60 54 77 42 73 33 32 51 26 16 46 37 58 50 18 17 14 11 15 12 12 11 45+1 45+1 45+1 44+0 45+0 45+0 45+0 44+0 44+1 44+1 44+1 42+1 38+1 38+1 38+1 42+1 32+1 29+1 29+1 29+1 32+1 31+1 28+1 28+1 28+1 31+1
17951 240404 https://sofifa.com/player/240404/jack-keeble/180002 J. Keeble Jack Keeble 18 1999-03-22 172 66 England Grimsby Town English League Two 4.0 46 56 40000 1000 CB Right 1 2 2 Low/Medium Lean No 78000.0 NaN SUB 24.0 NaN 2017-06-30 2018.0 NaN NaN 63.0 20.0 29.0 34.0 46.0 45.0 NaN NaN NaN NaN NaN NaN NaN 28 15 43 30 24 29 28 27 27 34 66 60 45 48 48 30 54 52 42 16 40 48 27 28 25 37 40 52 49 5 10 12 12 11 33+1 33+1 33+1 34+0 33+0 33+0 33+0 34+0 32+1 32+1 32+1 35+1 34+1 34+1 34+1 35+1 44+1 41+1 41+1 41+1 44+1 46+1 45+1 45+1 45+1 46+1
17952 11728 https://sofifa.com/player/11728/barry-richardson/180002 B. Richardson Barry Richardson 47 1969-08-05 185 77 England Wycombe Wanderers English League Two 4.0 46 46 2000 1000 GK Right 1 2 1 Medium/Medium Stocky No NaN NaN SUB 13.0 NaN 2014-01-30 2022.0 NaN NaN NaN NaN NaN NaN NaN NaN 39.0 50.0 39.0 37.0 25.0 50.0 NaN 11 11 12 12 12 11 12 11 13 22 25 25 35 51 44 13 51 32 47 16 44 16 13 17 22 44 14 12 13 39 50 39 50 37 19+1 19+1 19+1 19+0 19+0 19+0 19+0 19+0 19+1 19+1 19+1 19+1 19+1 19+1 19+1 19+1 20+1 21+1 21+1 21+1 20+1 20+1 22+1 22+1 22+1 20+1
17953 235352 https://sofifa.com/player/235352/tommy-kassemodel/180002 T. Käßemodel Tommy Käßemodel 28 1988-08-09 173 75 Germany FC Erzgebirge Aue German 2. Bundesliga 2.0 46 46 30000 2000 CM Right 1 3 2 Medium/Medium Stocky No 47000.0 NaN RES 29.0 NaN 2016-07-01 2018.0 NaN NaN 23.0 42.0 48.0 45.0 36.0 38.0 NaN NaN NaN NaN NaN NaN NaN 42 40 38 54 34 44 52 37 51 46 25 22 40 47 52 52 28 30 37 39 52 31 39 43 41 42 37 36 38 10 12 6 13 6 41+1 41+1 41+1 41+0 42+0 42+0 42+0 41+0 44+1 44+1 44+1 42+1 45+1 45+1 45+1 42+1 38+1 42+1 42+1 42+1 38+1 37+1 38+1 38+1 38+1 37+1
fifa_r <- read.csv("../../dataset/fifa/fifa-2018.csv", stringsAsFactors=TRUE)
str(object=fifa_r)
'data.frame': 17954 obs. of 106 variables:
$ sofifa_id : int 20801 158023 190871 167495 176580 188545 193080 183277 155862 167664 ...
$ player_url : Factor w/ 17954 levels "https://sofifa.com/player/100557/brian-barry-murphy/180002",..: 7239 790 4067 1541 2145 3543 4443 2772 683 1564 ...
$ short_name : Factor w/ 16995 levels "A. Abbas","A. Abdallah",..: 3194 9639 12427 11157 9846 13786 4198 4504 15377 5803 ...
$ long_name : Factor w/ 17898 levels "A. Benjamin Chiamuloira Paes",..: 3348 9912 12492 10600 10181 14205 3831 4499 15170 6279 ...
$ age : int 32 30 25 31 30 28 26 26 31 29 ...
$ dob : Factor w/ 6019 levels "1969-08-05","1973-01-15",..: 966 1696 3278 1302 1557 2077 2842 2898 1305 1840 ...
$ height_cm : int 185 170 175 193 182 185 193 173 183 184 ...
$ weight_kg : int 80 72 68 92 86 79 76 76 75 87 ...
$ nationality : Factor w/ 165 levels "Afghanistan",..: 122 6 19 58 159 121 141 13 141 6 ...
$ club_name : Factor w/ 648 levels "","1. FC Heidenheim 1846",..: 470 214 433 217 214 217 375 136 470 331 ...
$ league_name : Factor w/ 42 levels "","Argentina Primera División",..: 36 36 16 18 36 18 14 14 36 23 ...
$ league_rank : int 1 1 1 1 1 1 1 1 1 1 ...
$ overall : int 94 93 92 92 92 91 90 90 90 90 ...
$ potential : int 94 93 94 92 92 91 92 91 90 90 ...
$ value_eur : int 95500000 105000000 123000000 61000000 97000000 92000000 64500000 90500000 52000000 77000000 ...
$ wage_eur : int 575000 575000 275000 225000 500000 350000 200000 300000 300000 275000 ...
$ player_positions : Factor w/ 802 levels "CAM","CAM, CDM",..: 485 653 442 293 731 731 293 442 95 731 ...
$ preferred_foot : Factor w/ 2 levels "Left","Right": 2 1 2 2 2 2 2 2 2 2 ...
$ international_reputation : int 5 5 5 5 5 4 4 4 4 4 ...
$ weak_foot : int 4 4 5 4 4 4 3 4 3 4 ...
$ skill_moves : int 5 4 5 1 4 3 1 4 3 3 ...
$ work_rate : Factor w/ 9 levels "High/High","High/Low",..: 2 9 3 9 3 3 9 3 3 3 ...
$ body_type : Factor w/ 9 levels "Akinfenwa","C. Ronaldo",..: 2 5 6 7 7 7 4 7 7 7 ...
$ real_face : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
$ release_clause_eur : int 195800000 215300000 236800000 100700000 198900000 151800000 124200000 174200000 106600000 127100000 ...
$ player_tags : Factor w/ 72 levels "","#Acrobat",..: 61 27 60 1 3 11 1 60 8 11 ...
$ team_position : Factor w/ 30 levels "","CAM","CB",..: 16 27 16 7 29 29 7 16 10 29 ...
$ team_jersey_number : int 7 10 10 1 9 9 1 10 4 9 ...
$ loaned_from : Factor w/ 315 levels "","1. FC Kaiserslautern",..: 1 1 1 1 1 1 1 1 1 1 ...
$ joined : Factor w/ 1600 levels "","1991-06-01",..: 102 22 1561 176 612 602 176 242 31 1225 ...
$ contract_valid_until : int 2021 2018 2022 2021 2021 2021 2019 2020 2020 2021 ...
$ nation_position : Factor w/ 29 levels "","CAM","CB",..: 15 26 16 7 15 28 7 13 10 1 ...
$ nation_jersey_number : int 7 10 10 1 9 9 1 10 15 NA ...
$ pace : int 90 89 92 NA 82 81 NA 90 76 79 ...
$ shooting : int 93 90 84 NA 90 88 NA 82 63 87 ...
$ passing : int 82 86 79 NA 79 75 NA 84 71 70 ...
$ dribbling : int 90 96 95 NA 87 86 NA 92 71 83 ...
$ defending : int 33 26 30 NA 42 38 NA 32 88 25 ...
$ physic : int 80 61 60 NA 81 82 NA 66 83 74 ...
$ gk_diving : int NA NA NA 91 NA NA 90 NA NA NA ...
$ gk_handling : int NA NA NA 90 NA NA 85 NA NA NA ...
$ gk_kicking : int NA NA NA 95 NA NA 87 NA NA NA ...
$ gk_reflexes : int NA NA NA 89 NA NA 90 NA NA NA ...
$ gk_speed : int NA NA NA 60 NA NA 58 NA NA NA ...
$ gk_positioning : int NA NA NA 91 NA NA 86 NA NA NA ...
$ player_traits : Factor w/ 1515 levels "","Avoids Using Weaker Foot",..: 1304 501 295 617 262 697 622 152 384 1248 ...
$ attacking_crossing : int 85 77 75 15 77 62 17 80 66 68 ...
$ attacking_finishing : int 94 95 89 13 94 91 13 83 60 91 ...
$ attacking_heading_accuracy: int 88 71 62 25 77 85 21 57 91 86 ...
$ attacking_short_passing : int 83 88 81 55 83 83 50 86 78 75 ...
$ attacking_volleys : int 88 85 83 11 88 87 13 79 66 88 ...
$ skill_dribbling : int 91 97 96 30 86 85 18 93 61 84 ...
$ skill_curve : int 81 89 81 14 86 77 21 82 73 74 ...
$ skill_fk_accuracy : int 76 90 84 11 84 84 19 79 67 62 ...
$ skill_long_passing : int 77 87 75 59 64 65 51 81 72 59 ...
$ skill_ball_control : int 93 95 95 48 91 89 42 92 84 85 ...
$ movement_acceleration : int 89 92 94 58 88 79 57 93 75 78 ...
$ movement_sprint_speed : int 91 87 90 61 77 83 58 87 77 80 ...
$ movement_agility : int 89 90 96 52 86 78 60 93 79 75 ...
$ movement_reactions : int 96 95 88 85 93 91 88 85 85 88 ...
$ movement_balance : int 63 95 82 35 60 80 43 91 60 69 ...
$ power_shot_power : int 94 85 80 25 87 88 31 79 79 88 ...
$ power_jumping : int 95 68 61 78 69 84 67 59 93 79 ...
$ power_stamina : int 92 73 78 44 89 79 40 79 84 72 ...
$ power_strength : int 80 59 53 83 80 84 64 65 81 85 ...
$ power_long_shots : int 92 88 77 16 86 83 12 82 55 82 ...
$ mentality_aggression : int 63 48 56 29 78 80 38 54 84 50 ...
$ mentality_interceptions : int 29 22 36 30 41 39 30 41 88 20 ...
$ mentality_positioning : int 95 93 90 12 92 91 12 85 52 92 ...
$ mentality_vision : int 85 90 80 70 84 78 68 86 63 70 ...
$ mentality_penalties : int 85 74 81 47 85 81 40 86 68 70 ...
$ mentality_composure : int 95 96 92 70 83 87 64 87 80 86 ...
$ defending_marking : int 22 13 21 10 30 25 13 25 86 12 ...
$ defending_standing_tackle : int 31 28 24 10 45 42 21 27 89 22 ...
$ defending_sliding_tackle : int 23 26 33 11 38 19 13 22 91 18 ...
$ goalkeeping_diving : int 7 6 9 91 27 15 90 11 11 5 ...
$ goalkeeping_handling : int 11 11 9 90 25 6 85 12 8 12 ...
$ goalkeeping_kicking : int 15 15 15 95 31 12 87 6 9 7 ...
$ goalkeeping_positioning : int 14 14 15 91 33 8 86 8 7 5 ...
$ goalkeeping_reflexes : int 11 8 11 89 37 10 90 8 11 10 ...
$ ls : Factor w/ 192 levels "14+1","15+1",..: 192 191 185 38 191 190 34 178 140 189 ...
$ st : Factor w/ 192 levels "14+1","15+1",..: 192 191 185 38 191 190 34 178 140 189 ...
$ rs : Factor w/ 192 levels "14+1","15+1",..: 192 191 185 38 191 190 34 178 140 189 ...
$ lw : Factor w/ 180 levels "14+0","15+0",..: 180 179 178 40 176 167 36 177 113 161 ...
$ lf : Factor w/ 172 levels "14+0","15+0",..: 171 172 170 42 170 169 38 169 113 162 ...
$ cf : Factor w/ 172 levels "14+0","15+0",..: 171 172 170 42 170 169 38 169 113 162 ...
$ rf : Factor w/ 172 levels "14+0","15+0",..: 171 172 170 42 170 169 38 169 113 162 ...
$ rw : Factor w/ 180 levels "14+0","15+0",..: 180 179 178 40 176 167 36 177 113 161 ...
$ lam : Factor w/ 214 levels "16+1","17+1",..: 213 214 212 50 210 201 42 211 133 186 ...
$ cam : Factor w/ 214 levels "16+1","17+1",..: 213 214 212 50 210 201 42 211 133 186 ...
$ ram : Factor w/ 214 levels "16+1","17+1",..: 213 214 212 50 210 201 42 211 133 186 ...
$ lm : Factor w/ 205 levels "15+1","16+1",..: 204 205 203 46 199 191 41 202 137 177 ...
$ lcm : Factor w/ 189 levels "16+1","17+1",..: 179 185 167 50 172 160 46 176 140 125 ...
$ cm : Factor w/ 189 levels "16+1","17+1",..: 179 185 167 50 172 160 46 176 140 125 ...
$ rcm : Factor w/ 189 levels "16+1","17+1",..: 179 185 167 50 172 160 46 176 140 125 ...
$ rm : Factor w/ 205 levels "15+1","16+1",..: 204 205 203 46 199 191 41 202 137 177 ...
$ lwb : Factor w/ 203 levels "14+1","15+1",..: 124 102 113 38 134 97 37 112 195 70 ...
$ ldm : Factor w/ 207 levels "16+1","17+1",..: 106 89 89 42 122 105 41 100 203 62 ...
$ cdm : Factor w/ 207 levels "16+1","17+1",..: 106 89 89 42 122 105 41 100 203 62 ...
[list output truncated]
head(x=fifa_r, n=8)
sofifa_id player_url short_name long_name age dob height_cm weight_kg nationality club_name league_name league_rank overall potential value_eur wage_eur player_positions preferred_foot international_reputation weak_foot skill_moves work_rate body_type real_face release_clause_eur player_tags team_position team_jersey_number loaned_from joined contract_valid_until nation_position nation_jersey_number pace shooting passing dribbling defending physic gk_diving gk_handling gk_kicking gk_reflexes gk_speed gk_positioning player_traits attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes ls st rs lw lf cf rf rw lam cam ram lm lcm cm rcm rm lwb ldm cdm rdm rwb lb lcb cb rcb rb
1 20801 https://sofifa.com/player/20801/c-ronaldo-dos-santos-aveiro/180002 Cristiano Ronaldo Cristiano Ronaldo dos Santos Aveiro 32 1985-02-05 185 80 Portugal Real Madrid Spain Primera Division 1 94 94 95500000 575000 LW, ST Right 5 4 5 High/Low C. Ronaldo Yes 195800000 #Speedster, #Dribbler, #Distance Shooter, #Acrobat, #Clinical Finisher, #Complete Forward LW 7 2009-07-01 2021 LS 7 90 93 82 90 33 80 NA NA NA NA NA NA Power Free-Kick, Flair, Long Shot Taker (AI), Speed Dribbler (AI), Through Ball 85 94 88 83 88 91 81 76 77 93 89 91 89 96 63 94 95 92 80 92 63 29 95 85 85 95 22 31 23 7 11 15 14 11 92+2 92+2 92+2 91+3 91+3 91+3 91+3 91+3 89+4 89+4 89+4 89+4 82+4 82+4 82+4 89+4 66+4 62+4 62+4 62+4 66+4 61+4 53+4 53+4 53+4 61+4
2 158023 https://sofifa.com/player/158023/lionel-messi/180002 L. Messi Lionel Andrés Messi Cuccittini 30 1987-06-24 170 72 Argentina FC Barcelona Spain Primera Division 1 93 93 105000000 575000 RW Left 5 4 4 Medium/Medium Messi Yes 215300000 #Dribbler, #FK Specialist, #Acrobat, #Clinical Finisher RW 10 2004-07-01 2018 RW 10 89 90 86 96 26 61 NA NA NA NA NA NA Finesse Shot, Long Shot Taker (AI), Speed Dribbler (AI), Playmaker (AI), One Club Player, Chip Shot (AI) 77 95 71 88 85 97 89 90 87 95 92 87 90 95 95 85 68 73 59 88 48 22 93 90 74 96 13 28 26 6 11 15 14 8 88+4 88+4 88+4 91+2 92+1 92+1 92+1 91+2 92+1 92+1 92+1 90+3 84+4 84+4 84+4 90+3 62+4 59+4 59+4 59+4 62+4 57+4 45+4 45+4 45+4 57+4
3 190871 https://sofifa.com/player/190871/neymar-da-silva-santos-jr/180002 Neymar Neymar da Silva Santos Júnior 25 1992-02-05 175 68 Brazil Paris Saint-Germain French Ligue 1 1 92 94 123000000 275000 LW Right 5 5 5 High/Medium Neymar Yes 236800000 #Speedster, #Dribbler, #Acrobat LW 10 2017-08-03 2022 LW 10 92 84 79 95 30 60 NA NA NA NA NA NA Diver, Flair, Speed Dribbler (AI), Technical Dribbler (AI), Takes Finesse Free Kicks 75 89 62 81 83 96 81 84 75 95 94 90 96 88 82 80 61 78 53 77 56 36 90 80 81 92 21 24 33 9 9 15 15 11 84+4 84+4 84+4 89+3 88+3 88+3 88+3 89+3 88+4 88+4 88+4 87+4 79+4 79+4 79+4 87+4 64+4 59+4 59+4 59+4 64+4 59+4 46+4 46+4 46+4 59+4
4 167495 https://sofifa.com/player/167495/manuel-neuer/180002 M. Neuer Manuel Neuer 31 1986-03-27 193 92 Germany FC Bayern München German 1. Bundesliga 1 92 92 61000000 225000 GK Right 5 4 1 Medium/Medium Normal Yes 100700000 GK 1 2011-07-01 2021 GK 1 NA NA NA NA NA NA 91 90 95 89 60 91 GK Long Throw, 1-on-1 Rush, Rushes Out Of Goal, Comes For Crosses 15 13 25 55 11 30 14 11 59 48 58 61 52 85 35 25 78 44 83 16 29 30 12 70 47 70 10 10 11 91 90 95 91 89 36+4 36+4 36+4 40+3 41+3 41+3 41+3 40+3 47+4 47+4 47+4 44+4 48+4 48+4 48+4 44+4 36+4 41+4 41+4 41+4 36+4 34+4 33+4 33+4 33+4 34+4
5 176580 https://sofifa.com/player/176580/luis-suarez/180002 L. Suárez Luis Alberto Suárez Díaz 30 1987-01-24 182 86 Uruguay FC Barcelona Spain Primera Division 1 92 92 97000000 500000 ST Right 5 4 4 High/Medium Normal Yes 198900000 #Acrobat, #Clinical Finisher ST 9 2014-07-11 2021 LS 9 82 90 79 87 42 81 NA NA NA NA NA NA Diver, Beat Offside Trap, Technical Dribbler (AI) 77 94 77 83 88 86 86 84 64 91 88 77 86 93 60 87 69 89 80 86 78 41 92 84 85 83 30 45 38 27 25 31 33 37 88+4 88+4 88+4 87+3 88+3 88+3 88+3 87+3 87+4 87+4 87+4 85+4 80+4 80+4 80+4 85+4 68+4 65+4 65+4 65+4 68+4 64+4 58+4 58+4 58+4 64+4
6 188545 https://sofifa.com/player/188545/robert-lewandowski/180002 R. Lewandowski Robert Lewandowski 28 1988-08-21 185 79 Poland FC Bayern München German 1. Bundesliga 1 91 91 92000000 350000 ST Right 4 4 3 High/Medium Normal Yes 151800000 #Clinical Finisher ST 9 2014-07-01 2021 ST 9 81 88 75 86 38 82 NA NA NA NA NA NA Injury Free, Finesse Shot, Chip Shot (AI) 62 91 85 83 87 85 77 84 65 89 79 83 78 91 80 88 84 79 84 83 80 39 91 78 81 87 25 42 19 15 6 12 8 10 88+3 88+3 88+3 84+2 87+2 87+2 87+2 84+2 84+3 84+3 84+3 82+3 78+3 78+3 78+3 82+3 61+3 62+3 62+3 62+3 61+3 58+3 57+3 57+3 57+3 58+3
7 193080 https://sofifa.com/player/193080/david-de-gea-quintana/180002 De Gea David De Gea Quintana 26 1990-11-07 193 76 Spain Manchester United English Premier League 1 90 92 64500000 200000 GK Right 4 3 1 Medium/Medium Lean Yes 124200000 GK 1 2011-07-01 2019 GK 1 NA NA NA NA NA NA 90 85 87 90 58 86 GK Long Throw, Saves with Feet 17 13 21 50 13 18 21 19 51 42 57 58 60 88 43 31 67 40 64 12 38 30 12 68 40 64 13 21 13 90 85 87 86 90 33+2 33+2 33+2 37+1 38+1 38+1 38+1 37+1 43+2 43+2 43+2 40+2 45+2 45+2 45+2 40+2 36+2 41+2 41+2 41+2 36+2 35+2 34+2 34+2 34+2 35+2
8 183277 https://sofifa.com/player/183277/eden-hazard/180002 E. Hazard Eden Hazard 26 1991-01-07 173 76 Belgium Chelsea English Premier League 1 90 91 90500000 300000 LW Right 4 4 4 High/Medium Normal Yes 174200000 #Speedster, #Dribbler, #Acrobat LW 10 2012-07-01 2020 LF 10 90 82 84 92 32 66 NA NA NA NA NA NA Beat Offside Trap, Finesse Shot, Flair, Playmaker (AI), Technical Dribbler (AI) 80 83 57 86 79 93 82 79 81 92 93 87 93 85 91 79 59 79 65 82 54 41 85 86 86 87 25 27 22 11 12 6 8 8 82+3 82+3 82+3 88+2 87+2 87+2 87+2 88+2 88+3 88+3 88+3 87+3 81+3 81+3 81+3 87+3 64+3 61+3 61+3 61+3 64+3 59+3 47+3 47+3 47+3 59+3
tail(x=fifa_r, n=8)
sofifa_id player_url short_name long_name age dob height_cm weight_kg nationality club_name league_name league_rank overall potential value_eur wage_eur player_positions preferred_foot international_reputation weak_foot skill_moves work_rate body_type real_face release_clause_eur player_tags team_position team_jersey_number loaned_from joined contract_valid_until nation_position nation_jersey_number pace shooting passing dribbling defending physic gk_diving gk_handling gk_kicking gk_reflexes gk_speed gk_positioning player_traits attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes ls st rs lw lf cf rf rw lam cam ram lm lcm cm rcm rm lwb ldm cdm rdm rwb lb lcb cb rcb rb
17947 234733 https://sofifa.com/player/234733/max-wright/180002 M. Wright Max Wright 19 1998-04-06 170 76 England Grimsby Town English League Two 4 46 64 60000 2000 RW, RB, LB Right 1 3 3 Medium/Medium Lean No 143000 RES 31 2016-07-01 2018 NA 70 33 39 48 34 47 NA NA NA NA NA NA 42 36 38 40 25 47 32 23 38 40 71 70 69 31 71 25 48 50 53 34 28 32 46 43 20 45 39 30 30 5 8 14 14 9 41+1 41+1 41+1 46+0 43+0 43+0 43+0 46+0 44+1 44+1 44+1 45+1 40+1 40+1 40+1 45+1 41+1 37+1 37+1 37+1 41+1 40+1 37+1 37+1 37+1 40+1
17948 238308 https://sofifa.com/player/238308/leslie-sackey/180002 L. Sackey Leslie Sackey 18 1998-11-29 182 72 Ghana Scunthorpe United English League One 3 46 64 50000 1000 CB, ST Left 1 3 2 Medium/Medium Normal No 119000 RES 35 2017-05-02 2018 NA 49 20 24 30 41 61 NA NA NA NA NA NA 19 20 48 31 19 23 17 17 24 32 48 49 49 40 47 21 60 55 67 17 52 38 20 22 21 33 38 44 43 15 8 10 10 7 31+1 31+1 31+1 29+0 29+0 29+0 29+0 29+0 29+1 29+1 29+1 30+1 30+1 30+1 30+1 30+1 38+1 38+1 38+1 38+1 38+1 40+1 45+1 45+1 45+1 40+1
17949 238813 https://sofifa.com/player/238813/josh-lundstram/180002 J. Lundstram Josh Lundstram 18 1999-02-19 176 61 England Crewe Alexandra English League Two 4 46 64 60000 2000 CM Right 1 2 2 Medium/Medium Lean No 143000 RES 22 2017-05-03 2018 NA 58 35 44 45 45 47 NA NA NA NA NA NA 34 32 40 49 25 41 30 34 44 43 57 58 58 49 74 43 56 49 46 32 46 46 37 51 43 45 43 48 47 10 13 7 8 9 41+1 41+1 41+1 44+0 43+0 43+0 43+0 44+0 45+1 45+1 45+1 45+1 45+1 45+1 45+1 45+1 46+1 47+1 47+1 47+1 46+1 47+1 46+1 46+1 46+1 47+1
17950 237463 https://sofifa.com/player/237463/adam-kelsey/180002 A. Kelsey Adam Kelsey 17 1999-11-12 188 74 England Scunthorpe United English League One 3 46 63 50000 500 GK Right 1 2 1 Medium/Medium Lean No 119000 RES 41 2017-01-26 2019 NA NA NA NA NA NA NA 46 47 49 48 28 42 14 5 10 19 6 12 13 12 21 12 24 32 38 40 26 19 31 28 50 7 16 9 6 26 17 23 9 11 10 46 47 49 42 48 16+1 16+1 16+1 17+0 16+0 16+0 16+0 17+0 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 18+1 17+1 17+1 17+1 17+1 17+1
17951 231381 https://sofifa.com/player/231381/jordan-young/180002 J. Young Jordan Young 17 1999-07-31 175 71 Scotland Swindon Town English League Two 4 46 61 60000 2000 ST Left 1 2 2 Medium/Medium Lean No 143000 SUB 26 2015-10-17 2019 NA 58 47 35 43 20 33 NA NA NA NA NA NA 28 47 47 42 33 37 32 25 30 41 66 51 60 54 77 42 73 33 32 51 26 16 46 37 58 50 18 17 14 11 15 12 12 11 45+1 45+1 45+1 44+0 45+0 45+0 45+0 44+0 44+1 44+1 44+1 42+1 38+1 38+1 38+1 42+1 32+1 29+1 29+1 29+1 32+1 31+1 28+1 28+1 28+1 31+1
17952 240404 https://sofifa.com/player/240404/jack-keeble/180002 J. Keeble Jack Keeble 18 1999-03-22 172 66 England Grimsby Town English League Two 4 46 56 40000 1000 CB Right 1 2 2 Low/Medium Lean No 78000 SUB 24 2017-06-30 2018 NA 63 20 29 34 46 45 NA NA NA NA NA NA 28 15 43 30 24 29 28 27 27 34 66 60 45 48 48 30 54 52 42 16 40 48 27 28 25 37 40 52 49 5 10 12 12 11 33+1 33+1 33+1 34+0 33+0 33+0 33+0 34+0 32+1 32+1 32+1 35+1 34+1 34+1 34+1 35+1 44+1 41+1 41+1 41+1 44+1 46+1 45+1 45+1 45+1 46+1
17953 11728 https://sofifa.com/player/11728/barry-richardson/180002 B. Richardson Barry Richardson 47 1969-08-05 185 77 England Wycombe Wanderers English League Two 4 46 46 2000 1000 GK Right 1 2 1 Medium/Medium Stocky No NA SUB 13 2014-01-30 2022 NA NA NA NA NA NA NA 39 50 39 37 25 50 11 11 12 12 12 11 12 11 13 22 25 25 35 51 44 13 51 32 47 16 44 16 13 17 22 44 14 12 13 39 50 39 50 37 19+1 19+1 19+1 19+0 19+0 19+0 19+0 19+0 19+1 19+1 19+1 19+1 19+1 19+1 19+1 19+1 20+1 21+1 21+1 21+1 20+1 20+1 22+1 22+1 22+1 20+1
17954 235352 https://sofifa.com/player/235352/tommy-kassemodel/180002 T. Käßemodel Tommy Käßemodel 28 1988-08-09 173 75 Germany FC Erzgebirge Aue German 2. Bundesliga 2 46 46 30000 2000 CM Right 1 3 2 Medium/Medium Stocky No 47000 RES 29 2016-07-01 2018 NA 23 42 48 45 36 38 NA NA NA NA NA NA 42 40 38 54 34 44 52 37 51 46 25 22 40 47 52 52 28 30 37 39 52 31 39 43 41 42 37 36 38 10 12 6 13 6 41+1 41+1 41+1 41+0 42+0 42+0 42+0 41+0 44+1 44+1 44+1 42+1 45+1 45+1 45+1 42+1 38+1 42+1 42+1 42+1 38+1 37+1 38+1 38+1 38+1 37+1
If you would like to skip the data preparation and exploration, jump straight to the hierarchical clustering or K-means clustering section.
Wrangling Data
# Make a copy of data frame, subset to specific columns
fifa_clean_r <- fifa_r[, 47:80]
# View structure of data frame
str(object=fifa_clean_r)
'data.frame': 17954 obs. of 34 variables:
$ attacking_crossing : int 85 77 75 15 77 62 17 80 66 68 ...
$ attacking_finishing : int 94 95 89 13 94 91 13 83 60 91 ...
$ attacking_heading_accuracy: int 88 71 62 25 77 85 21 57 91 86 ...
$ attacking_short_passing : int 83 88 81 55 83 83 50 86 78 75 ...
$ attacking_volleys : int 88 85 83 11 88 87 13 79 66 88 ...
$ skill_dribbling : int 91 97 96 30 86 85 18 93 61 84 ...
$ skill_curve : int 81 89 81 14 86 77 21 82 73 74 ...
$ skill_fk_accuracy : int 76 90 84 11 84 84 19 79 67 62 ...
$ skill_long_passing : int 77 87 75 59 64 65 51 81 72 59 ...
$ skill_ball_control : int 93 95 95 48 91 89 42 92 84 85 ...
$ movement_acceleration : int 89 92 94 58 88 79 57 93 75 78 ...
$ movement_sprint_speed : int 91 87 90 61 77 83 58 87 77 80 ...
$ movement_agility : int 89 90 96 52 86 78 60 93 79 75 ...
$ movement_reactions : int 96 95 88 85 93 91 88 85 85 88 ...
$ movement_balance : int 63 95 82 35 60 80 43 91 60 69 ...
$ power_shot_power : int 94 85 80 25 87 88 31 79 79 88 ...
$ power_jumping : int 95 68 61 78 69 84 67 59 93 79 ...
$ power_stamina : int 92 73 78 44 89 79 40 79 84 72 ...
$ power_strength : int 80 59 53 83 80 84 64 65 81 85 ...
$ power_long_shots : int 92 88 77 16 86 83 12 82 55 82 ...
$ mentality_aggression : int 63 48 56 29 78 80 38 54 84 50 ...
$ mentality_interceptions : int 29 22 36 30 41 39 30 41 88 20 ...
$ mentality_positioning : int 95 93 90 12 92 91 12 85 52 92 ...
$ mentality_vision : int 85 90 80 70 84 78 68 86 63 70 ...
$ mentality_penalties : int 85 74 81 47 85 81 40 86 68 70 ...
$ mentality_composure : int 95 96 92 70 83 87 64 87 80 86 ...
$ defending_marking : int 22 13 21 10 30 25 13 25 86 12 ...
$ defending_standing_tackle : int 31 28 24 10 45 42 21 27 89 22 ...
$ defending_sliding_tackle : int 23 26 33 11 38 19 13 22 91 18 ...
$ goalkeeping_diving : int 7 6 9 91 27 15 90 11 11 5 ...
$ goalkeeping_handling : int 11 11 9 90 25 6 85 12 8 12 ...
$ goalkeeping_kicking : int 15 15 15 95 31 12 87 6 9 7 ...
$ goalkeeping_positioning : int 14 14 15 91 33 8 86 8 7 5 ...
$ goalkeeping_reflexes : int 11 8 11 89 37 10 90 8 11 10 ...
head(x=fifa_clean_r, n=8)
attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes
1 85 94 88 83 88 91 81 76 77 93 89 91 89 96 63 94 95 92 80 92 63 29 95 85 85 95 22 31 23 7 11 15 14 11
2 77 95 71 88 85 97 89 90 87 95 92 87 90 95 95 85 68 73 59 88 48 22 93 90 74 96 13 28 26 6 11 15 14 8
3 75 89 62 81 83 96 81 84 75 95 94 90 96 88 82 80 61 78 53 77 56 36 90 80 81 92 21 24 33 9 9 15 15 11
4 15 13 25 55 11 30 14 11 59 48 58 61 52 85 35 25 78 44 83 16 29 30 12 70 47 70 10 10 11 91 90 95 91 89
5 77 94 77 83 88 86 86 84 64 91 88 77 86 93 60 87 69 89 80 86 78 41 92 84 85 83 30 45 38 27 25 31 33 37
6 62 91 85 83 87 85 77 84 65 89 79 83 78 91 80 88 84 79 84 83 80 39 91 78 81 87 25 42 19 15 6 12 8 10
7 17 13 21 50 13 18 21 19 51 42 57 58 60 88 43 31 67 40 64 12 38 30 12 68 40 64 13 21 13 90 85 87 86 90
8 80 83 57 86 79 93 82 79 81 92 93 87 93 85 91 79 59 79 65 82 54 41 85 86 86 87 25 27 22 11 12 6 8 8
tail(x=fifa_clean_r, n=8)
attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes
17947 42 36 38 40 25 47 32 23 38 40 71 70 69 31 71 25 48 50 53 34 28 32 46 43 20 45 39 30 30 5 8 14 14 9
17948 19 20 48 31 19 23 17 17 24 32 48 49 49 40 47 21 60 55 67 17 52 38 20 22 21 33 38 44 43 15 8 10 10 7
17949 34 32 40 49 25 41 30 34 44 43 57 58 58 49 74 43 56 49 46 32 46 46 37 51 43 45 43 48 47 10 13 7 8 9
17950 14 5 10 19 6 12 13 12 21 12 24 32 38 40 26 19 31 28 50 7 16 9 6 26 17 23 9 11 10 46 47 49 42 48
17951 28 47 47 42 33 37 32 25 30 41 66 51 60 54 77 42 73 33 32 51 26 16 46 37 58 50 18 17 14 11 15 12 12 11
17952 28 15 43 30 24 29 28 27 27 34 66 60 45 48 48 30 54 52 42 16 40 48 27 28 25 37 40 52 49 5 10 12 12 11
17953 11 11 12 12 12 11 12 11 13 22 25 25 35 51 44 13 51 32 47 16 44 16 13 17 22 44 14 12 13 39 50 39 50 37
17954 42 40 38 54 34 44 52 37 51 46 25 22 40 47 52 52 28 30 37 39 52 31 39 43 41 42 37 36 38 10 12 6 13 6
# Make a copy of data frame, subset to specific columns
fifa_clean_py = fifa_py.iloc[:, 46:80].copy()
# View structure of data frame
fifa_clean_py.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17954 entries, 0 to 17953
Data columns (total 34 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 attacking_crossing 17954 non-null int64
1 attacking_finishing 17954 non-null int64
2 attacking_heading_accuracy 17954 non-null int64
3 attacking_short_passing 17954 non-null int64
4 attacking_volleys 17954 non-null int64
5 skill_dribbling 17954 non-null int64
6 skill_curve 17954 non-null int64
7 skill_fk_accuracy 17954 non-null int64
8 skill_long_passing 17954 non-null int64
9 skill_ball_control 17954 non-null int64
10 movement_acceleration 17954 non-null int64
11 movement_sprint_speed 17954 non-null int64
12 movement_agility 17954 non-null int64
13 movement_reactions 17954 non-null int64
14 movement_balance 17954 non-null int64
15 power_shot_power 17954 non-null int64
16 power_jumping 17954 non-null int64
17 power_stamina 17954 non-null int64
18 power_strength 17954 non-null int64
19 power_long_shots 17954 non-null int64
20 mentality_aggression 17954 non-null int64
21 mentality_interceptions 17954 non-null int64
22 mentality_positioning 17954 non-null int64
23 mentality_vision 17954 non-null int64
24 mentality_penalties 17954 non-null int64
25 mentality_composure 17954 non-null int64
26 defending_marking 17954 non-null int64
27 defending_standing_tackle 17954 non-null int64
28 defending_sliding_tackle 17954 non-null int64
29 goalkeeping_diving 17954 non-null int64
30 goalkeeping_handling 17954 non-null int64
31 goalkeeping_kicking 17954 non-null int64
32 goalkeeping_positioning 17954 non-null int64
33 goalkeeping_reflexes 17954 non-null int64
dtypes: int64(34)
memory usage: 4.7 MB
fifa_clean_py.head(n=8)
attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes
0 85 94 88 83 88 91 81 76 77 93 89 91 89 96 63 94 95 92 80 92 63 29 95 85 85 95 22 31 23 7 11 15 14 11
1 77 95 71 88 85 97 89 90 87 95 92 87 90 95 95 85 68 73 59 88 48 22 93 90 74 96 13 28 26 6 11 15 14 8
2 75 89 62 81 83 96 81 84 75 95 94 90 96 88 82 80 61 78 53 77 56 36 90 80 81 92 21 24 33 9 9 15 15 11
3 15 13 25 55 11 30 14 11 59 48 58 61 52 85 35 25 78 44 83 16 29 30 12 70 47 70 10 10 11 91 90 95 91 89
4 77 94 77 83 88 86 86 84 64 91 88 77 86 93 60 87 69 89 80 86 78 41 92 84 85 83 30 45 38 27 25 31 33 37
5 62 91 85 83 87 85 77 84 65 89 79 83 78 91 80 88 84 79 84 83 80 39 91 78 81 87 25 42 19 15 6 12 8 10
6 17 13 21 50 13 18 21 19 51 42 57 58 60 88 43 31 67 40 64 12 38 30 12 68 40 64 13 21 13 90 85 87 86 90
7 80 83 57 86 79 93 82 79 81 92 93 87 93 85 91 79 59 79 65 82 54 41 85 86 86 87 25 27 22 11 12 6 8 8
fifa_clean_py.tail(n=8)
attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes
17946 42 36 38 40 25 47 32 23 38 40 71 70 69 31 71 25 48 50 53 34 28 32 46 43 20 45 39 30 30 5 8 14 14 9
17947 19 20 48 31 19 23 17 17 24 32 48 49 49 40 47 21 60 55 67 17 52 38 20 22 21 33 38 44 43 15 8 10 10 7
17948 34 32 40 49 25 41 30 34 44 43 57 58 58 49 74 43 56 49 46 32 46 46 37 51 43 45 43 48 47 10 13 7 8 9
17949 14 5 10 19 6 12 13 12 21 12 24 32 38 40 26 19 31 28 50 7 16 9 6 26 17 23 9 11 10 46 47 49 42 48
17950 28 47 47 42 33 37 32 25 30 41 66 51 60 54 77 42 73 33 32 51 26 16 46 37 58 50 18 17 14 11 15 12 12 11
17951 28 15 43 30 24 29 28 27 27 34 66 60 45 48 48 30 54 52 42 16 40 48 27 28 25 37 40 52 49 5 10 12 12 11
17952 11 11 12 12 12 11 12 11 13 22 25 25 35 51 44 13 51 32 47 16 44 16 13 17 22 44 14 12 13 39 50 39 50 37
17953 42 40 38 54 34 44 52 37 51 46 25 22 40 47 52 52 28 30 37 39 52 31 39 43 41 42 37 36 38 10 12 6 13 6
# Make a copy of data frame, subset to specific columns, view structure of data frame
fifa_clean_jl = fifa_jl[:, 47:80]
17954×34 DataFrame
Row │ attacking_crossing attacking_finishing attacking_heading_accuracy attacking_short_passing attacking_volleys skill_dribbling skill_curve skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration movement_sprint_speed movement_agility movement_reactions movement_balance power_shot_power power_jumping power_stamina power_strength power_long_shots mentality_aggression mentality_interceptions mentality_positioning mentality_vision mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes
│ Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64
───────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 85 94 88 83 88 91 81 76 77 93 89 91 89 96 63 94 95 92 80 92 63 29 95 85 85 95 22 31 23 7 11 15 14 11
2 │ 77 95 71 88 85 97 89 90 87 95 92 87 90 95 95 85 68 73 59 88 48 22 93 90 74 96 13 28 26 6 11 15 14 8
3 │ 75 89 62 81 83 96 81 84 75 95 94 90 96 88 82 80 61 78 53 77 56 36 90 80 81 92 21 24 33 9 9 15 15 11
4 │ 15 13 25 55 11 30 14 11 59 48 58 61 52 85 35 25 78 44 83 16 29 30 12 70 47 70 10 10 11 91 90 95 91 89
5 │ 77 94 77 83 88 86 86 84 64 91 88 77 86 93 60 87 69 89 80 86 78 41 92 84 85 83 30 45 38 27 25 31 33 37
6 │ 62 91 85 83 87 85 77 84 65 89 79 83 78 91 80 88 84 79 84 83 80 39 91 78 81 87 25 42 19 15 6 12 8 10
7 │ 17 13 21 50 13 18 21 19 51 42 57 58 60 88 43 31 67 40 64 12 38 30 12 68 40 64 13 21 13 90 85 87 86 90
8 │ 80 83 57 86 79 93 82 79 81 92 93 87 93 85 91 79 59 79 65 82 54 41 85 86 86 87 25 27 22 11 12 6 8 8
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
17948 │ 19 20 48 31 19 23 17 17 24 32 48 49 49 40 47 21 60 55 67 17 52 38 20 22 21 33 38 44 43 15 8 10 10 7
17949 │ 34 32 40 49 25 41 30 34 44 43 57 58 58 49 74 43 56 49 46 32 46 46 37 51 43 45 43 48 47 10 13 7 8 9
17950 │ 14 5 10 19 6 12 13 12 21 12 24 32 38 40 26 19 31 28 50 7 16 9 6 26 17 23 9 11 10 46 47 49 42 48
17951 │ 28 47 47 42 33 37 32 25 30 41 66 51 60 54 77 42 73 33 32 51 26 16 46 37 58 50 18 17 14 11 15 12 12 11
17952 │ 28 15 43 30 24 29 28 27 27 34 66 60 45 48 48 30 54 52 42 16 40 48 27 28 25 37 40 52 49 5 10 12 12 11
17953 │ 11 11 12 12 12 11 12 11 13 22 25 25 35 51 44 13 51 32 47 16 44 16 13 17 22 44 14 12 13 39 50 39 50 37
17954 │ 42 40 38 54 34 44 52 37 51 46 25 22 40 47 52 52 28 30 37 39 52 31 39 43 41 42 37 36 38 10 12 6 13 6
17939 rows omitted
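The three subsetting calls above select the same 34 columns even though the numbers differ: R and Julia count positions from 1 and include both endpoints (`47:80`), whereas pandas `iloc` counts from 0 and excludes the stop (`46:80`). Here is a minimal sketch of that conversion. The helper `one_based_to_slice` is hypothetical, and a plain list of positions stands in for the real data frame.

```python
def one_based_to_slice(first, last):
    """Convert a 1-based inclusive range (R/Julia) to a 0-based half-open slice (Python)."""
    return slice(first - 1, last)


# Stand-in for the column positions of the full data frame, numbered the way
# R and Julia see them (the width of 106 is an assumption; only positions matter).
one_based_positions = list(range(1, 107))

selected = one_based_positions[one_based_to_slice(47, 80)]
print(selected[0], selected[-1], len(selected))
```

The slice produced for `47:80` is `slice(46, 80)`, which is exactly what the `iloc` call uses.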
Performing Exploratory Data Analysis (EDA)
Univariate Analysis on Age (Numerical/Discrete)
Histogram
ggplot2::ggplot(fifa_r, ggplot2::aes(x=age)) +
ggplot2::geom_histogram(binwidth = 1, fill = "#00a6c8") +
ggplot2::labs(
x = "Age",
y = "Count"
)
Summary Statistics
length(fifa_r$age)
[1] 17954
unique(fifa_r$age)
[1] 32 30 25 31 28 26 29 27 39 24 23 33 35 34 36 21 22 18 20 19 37 38 17 40 44 41 16 43 47
summary(fifa_r$age)
Min. 1st Qu. Median Mean 3rd Qu. Max.
16 21 25 25 29 47
get_mode(fifa_r$age)
[1] 25
fifa_r %>% dplyr::summarize(
"Median Absolute Deviation (MAD)" = mad(age),
"Variance" = var(age),
"Standard Deviation" = sd(age),
"Interquartile Range (IQR)" = IQR(age)
)
Median Absolute Deviation (MAD) Variance Standard Deviation Interquartile Range (IQR)
1 5.9 21 4.6 8
Histogram
(
plotnine.ggplot(data = fifa_py, mapping = plotnine.mapping.aes(x = "age"))
+ plotnine.geoms.geom_histogram(stat = plotnine.stats.stat_bin(binwidth = 1))
+ plotnine.labels.labs(
x = "Age",
y = "Count"
)
)
<Figure Size: (1280 x 960)>
Summary Statistics
fifa_py.age.describe()
count 17954.000000
mean 25.164532
std 4.619855
min 16.000000
25% 21.000000
50% 25.000000
75% 29.000000
max 47.000000
Name: age, dtype: float64
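`describe()` does not report the mode or the robust spread measures shown in the R output. The sketch below computes them with Python's standard library, under two assumptions worth stating: R's `mad()` is the median absolute deviation multiplied by 1.4826 by default, and R's `IQR()` uses the same interpolation as `method="inclusive"`. The function name `dispersion_summary` is made up for illustration, and the sample is only the ages of the first eight players from the `head()` output, not the full column.

```python
import statistics


def dispersion_summary(values):
    """Return mode(s), scaled median absolute deviation, sample variance, and IQR."""
    median = statistics.median(values)
    # Raw median absolute deviation around the median.
    raw_mad = statistics.median(abs(v - median) for v in values)
    # Quartiles with linear interpolation between closest ranks.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "modes": statistics.multimode(values),
        "mad": 1.4826 * raw_mad,  # same scaling constant R's mad() applies by default
        "variance": statistics.variance(values),  # sample variance (n - 1 denominator)
        "iqr": q3 - q1,
    }


# Ages of the first eight players shown in the head() output.
ages = [32, 30, 25, 31, 30, 28, 26, 26]
summary = dispersion_summary(ages)
print(summary)
```

Because `multimode` returns every tied value, it avoids silently picking one mode when several ages are equally common.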
Histogram
histogram_age = Gadfly.plot(
fifa_jl,
x=:age,
Gadfly.Geom.histogram(),
Gadfly.Coord.cartesian(xmin=15, xmax=50, ymin=0, ymax=1500),
Gadfly.Guide.xlabel("Age"),
Gadfly.Guide.ylabel("Count"),
Gadfly.Theme(
default_color=colorant"#00a6c8",
panel_fill=colorant"#f0f0f0",
background_color="white",
grid_color="white",
grid_line_width=4px,
minor_label_font_size=20px,
major_label_font_size=24px
)
);
Summary Statistics
describe(fifa_jl.age)
Summary Stats:
Length: 17954
Missing Count: 0
Mean: 25.164532
Minimum: 16.000000
1st Quartile: 21.000000
Median: 25.000000
3rd Quartile: 29.000000
Maximum: 47.000000
Type: Int64
Hierarchical Clustering (HClust)
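# # Calculate pairwise Euclidean distances (about 161 million values for 17,954 rows, so avoid printing them)
# distances_r <- dist(fifa_clean_r, method="euclidean")
#
# # Build the hierarchy using Ward's minimum-variance linkage
# clusters_hclust_r <- hclust(distances_r, method="ward.D2")
# import scipy.cluster.hierarchy
#
# # Cluster on the numeric subset; the full data frame contains non-numeric columns
# z = scipy.cluster.hierarchy.linkage(fifa_clean_py, method="ward", metric="euclidean")
# fifa_py["cluster_labels"] = scipy.cluster.hierarchy.fcluster(z, t=2, criterion="maxclust")
# scipy.cluster.hierarchy.dendrogram(z)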
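Ward's linkage (`method="ward.D2"` in R, `method="ward"` in SciPy) merges, at every step, the two clusters whose union increases the total within-cluster sum of squares the least. The from-scratch sketch below illustrates that rule on eight rows only: the `skill_ball_control` and `goalkeeping_diving` values of the first eight players from the `head()` output. The function names are hypothetical. This naive version recomputes every pairwise cost at each step, so it is meant for understanding rather than for the full dataset.

```python
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b were merged."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(centroid(a), centroid(b)))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance


def ward_clusters(points, n_clusters):
    """Agglomerate singleton clusters until only n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters that is cheapest to merge.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


# (skill_ball_control, goalkeeping_diving) for the first eight players.
players = [(93, 7), (95, 6), (95, 9), (48, 91), (91, 27), (89, 15), (42, 90), (92, 11)]
clusters = ward_clusters(players, n_clusters=2)
for members in sorted(clusters, key=len):
    print(len(members), sorted(members))
```

On this small sample the two goalkeepers end up in their own cluster, separate from the six outfield players.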
scatterplot_sliding_tackle_aggression = Gadfly.plot(
fifa_clean_jl,
x=:defending_sliding_tackle,
y=:mentality_aggression,
Gadfly.Geom.point(),
# Gadfly.Coord.cartesian(xmin=15, xmax=50, ymin=0, ymax=1500),
Gadfly.Guide.xlabel("Defending Sliding Tackle"),
Gadfly.Guide.ylabel("Mentality Aggression"),
Gadfly.Theme(
default_color=colorant"#00a6c8",
panel_fill=colorant"#f0f0f0",
background_color="white",
grid_color="white",
grid_line_width=4px,
minor_label_font_size=20px,
major_label_font_size=24px
)
);
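# # Scale each feature by its standard deviation (the same scaling scipy.cluster.vq.whiten applies)
# fifa_clean_jl[!, :scaled_sliding_tackle] = fifa_clean_jl.defending_sliding_tackle ./ Statistics.std(fifa_clean_jl.defending_sliding_tackle; corrected=false)
# fifa_clean_jl[!, :scaled_aggression] = fifa_clean_jl.mentality_aggression ./ Statistics.std(fifa_clean_jl.mentality_aggression; corrected=false)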
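The `whiten` scaling referenced above divides each feature by its standard deviation, without centering. Every feature then has unit variance, so no single attribute dominates the Euclidean distances used for clustering. Here is a minimal standard-library sketch of that idea. The helper `whiten_column` is hypothetical, and the values are the `defending_sliding_tackle` and `mentality_aggression` ratings of the first eight players only.

```python
import statistics


def whiten_column(values):
    """Divide a feature by its population standard deviation (no centering)."""
    scale = statistics.pstdev(values)
    return [v / scale for v in values]


# Ratings of the first eight players from the head() output.
sliding_tackle = [23, 26, 33, 11, 38, 19, 13, 22]
aggression = [63, 48, 56, 29, 78, 80, 38, 54]

scaled_sliding_tackle = whiten_column(sliding_tackle)
scaled_aggression = whiten_column(aggression)

# After scaling, both features have (approximately) unit standard deviation.
print(statistics.pstdev(scaled_sliding_tackle), statistics.pstdev(scaled_aggression))
```

Since every value in a column is divided by the same constant, the ordering of players within each feature is unchanged.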
K-Means Clustering
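# import sklearn.cluster
# # K-means with scikit-learn (Euclidean distance is the only metric it supports)
# kmeans = sklearn.cluster.KMeans(n_clusters=3, n_init=10).fit(fifa_clean_py)
# print(kmeans.labels_)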
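To show what a K-means fit does internally, here is a small sketch of Lloyd's algorithm: assign every point to its nearest centroid, recompute each centroid as the mean of its members, and repeat until the assignments stop changing. The function names are hypothetical. The data are the `skill_ball_control` and `goalkeeping_diving` values of the first eight players from the `head()` output. Seeding the centroids with the first two points is a simplification for reproducibility; library implementations typically use randomized or k-means++ seeding with several restarts.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster empties out
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# (skill_ball_control, goalkeeping_diving) for the first eight players.
players = [(93, 7), (95, 6), (95, 9), (48, 91), (91, 27), (89, 15), (42, 90), (92, 11)]
labels, centroids = k_means(players, initial_centroids=players[:2])
print(labels)
print(centroids)
```

Even from this poor starting point (two near-identical outfield players), the assignments settle into a goalkeeper cluster and an outfield cluster within a few iterations.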
Model-Based Clustering
# clusters_model_based_r = mclust::Mclust(fifa_clean_r)
# summary(clusters_model_based_r)
#
# bic <- sapply(1:12,FUN=function(x) {
# Mclust(fifa_clean_r, G=x)$bic
# })
# bic_df <- data.frame(clusters=1:12, bic)
# ggplot(bic_df, aes(x=clusters, y=bic)) +
# geom_line(color='steelblue',size=1.4)+
# scale_x_continuous(breaks=1:12, minor_breaks = 1:12)
References
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: with Applications in R (2nd ed.). Springer. https://doi.org/10.1007/978-1-0716-1418-1
- Shmueli, G., Patel, N. R., & Bruce, P. C. (2007). Data Mining for Business Intelligence. Wiley.
- Albright, S. C., Winston, W. L., & Zappe, C. (2003). Data Analysis for Managers with Microsoft Excel (2nd ed.). South-Western College Publishing.