Overview

Dataset statistics

Number of variables: 19
Number of observations: 33053760
Missing cells: 208735906
Missing cells (%): 33.2%
Total size in memory: 4.7 GiB
Average record size in memory: 152.0 B
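The derived figures in this block follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the total memory is observations × average record size:

```python
# Figures copied from the "Dataset statistics" block.
n_vars = 19
n_obs = 33_053_760
missing_cells = 208_735_906
bytes_per_record = 152.0  # consistent with 19 columns at 8 bytes each

total_cells = n_vars * n_obs
missing_pct = 100 * missing_cells / total_cells
total_gib = n_obs * bytes_per_record / 2**30

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"memory:        {total_gib:.1f} GiB")
```

Under these assumptions the script reproduces the reported 33.2% and 4.7 GiB.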

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 32583584 (98.6%) missing values
collater_valueofguarantee_876L has 31725982 (96.0%) missing values
pmts_dpd_1073P has 27121373 (82.1%) missing values
pmts_dpd_303P has 18691259 (56.5%) missing values
pmts_month_158T has 23804232 (72.0%) missing values
pmts_month_706T has 2619852 (7.9%) missing values
pmts_overdue_1140A has 27091280 (82.0%) missing values
pmts_overdue_1152A has 18674260 (56.5%) missing values
pmts_year_1139T has 23804232 (72.0%) missing values
pmts_year_507T has 2619852 (7.9%) missing values
collater_valueofguarantee_1124L is highly skewed (γ1 = 44.20781194)
collater_valueofguarantee_876L is highly skewed (γ1 = 79.59866574)
pmts_dpd_303P is highly skewed (γ1 = 31.61005875)
pmts_overdue_1140A is highly skewed (γ1 = 1218.634842)
pmts_overdue_1152A is highly skewed (γ1 = 305.0209325)
collater_valueofguarantee_1124L has 430530 (1.3%) zeros
collater_valueofguarantee_876L has 1178949 (3.6%) zeros
num_group1 has 6717721 (20.3%) zeros
num_group2 has 1377202 (4.2%) zeros
pmts_dpd_1073P has 5596903 (16.9%) zeros
pmts_dpd_303P has 12092577 (36.6%) zeros
pmts_overdue_1140A has 5621230 (17.0%) zeros
pmts_overdue_1152A has 12008791 (36.3%) zeros
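The "highly skewed" alerts can be read as a threshold on the absolute skewness γ1 of each numeric variable. The sketch below applies such a rule to the skewness values reported in the Variables section. The cut-off of 20 is an assumption, not a value stated in this report; it was chosen because it is consistent with which columns are flagged here (pmts_dpd_1073P at 16.4 is not flagged, pmts_dpd_303P at 31.6 is).

```python
# Assumed cut-off; the report itself does not state the threshold.
SKEW_THRESHOLD = 20

# Skewness per numeric variable, copied from the Variables section.
reported_skewness = {
    "case_id": -0.4679276101,
    "collater_valueofguarantee_1124L": 44.20781194,
    "collater_valueofguarantee_876L": 79.59866574,
    "num_group1": 6.169206137,
    "num_group2": 0.545312801,
    "pmts_dpd_1073P": 16.44645718,
    "pmts_dpd_303P": 31.61005875,
    "pmts_month_158T": 0.0,
    "pmts_month_706T": 0.0,
    "pmts_overdue_1140A": 1218.634842,
    "pmts_overdue_1152A": 305.0209325,
    "pmts_year_1139T": -0.534206743,
    "pmts_year_507T": -0.6855102705,
}

# Flag every variable whose absolute skewness exceeds the cut-off.
flagged = sorted(
    name for name, g1 in reported_skewness.items() if abs(g1) > SKEW_THRESHOLD
)
for name in flagged:
    print(f"{name} is highly skewed (γ1 = {reported_skewness[name]})")
```

With this cut-off the five flagged names coincide with the five skewness alerts listed above.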

Reproduction

Analysis started: 2024-02-13 19:48:47.646804
Analysis finished: 2024-02-13 19:49:42.540205
Duration: 54.89 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
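A report of this kind is typically produced by passing a pandas DataFrame to ydata-profiling. The following is a minimal sketch, not the exact invocation used for this report: the function name, the Parquet input format and both file paths are hypothetical, and the settings stored in config.json are not reproduced.

```python
def build_profile(parquet_path: str, html_path: str) -> None:
    """Profile one table and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even when the heavy
    # dependencies are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Hypothetical input: the source file behind this report is not named.
    df = pd.read_parquet(parquet_path)

    # Build the report with default settings and save it as HTML.
    report = ProfileReport(df, title="Overview")
    report.to_file(html_path)
```

A call such as build_profile("table.parquet", "report.html") would need roughly the 4.7 GiB of memory listed above just to hold the table, before any profiling overhead.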

Variables

case_id
Real number (ℝ)

Distinct: 231250
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1457854.653
Minimum: 36830
Maximum: 2658153
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 36830
5-th percentile: 180883
Q1: 917023
Median: 1666366
Q3: 1716789
95-th percentile: 2648671
Maximum: 2658153
Range: 2621323
Interquartile range (IQR): 799766

Descriptive statistics

Standard deviation: 657453.4552
Coefficient of variation (CV): 0.4509732531
Kurtosis: 0.03472349159
Mean: 1457854.653
Median Absolute Deviation (MAD): 58227
Skewness: -0.4679276101
Sum: 4.81875778 × 10^13
Variance: 4.322450457 × 10^11
Monotonicity: Increasing
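Several of the case_id statistics are derived from others in the same block. A short sketch of those relationships, using the standard definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared) and the values reported above:

```python
import math

# Values copied from the case_id statistics.
minimum, maximum = 36830, 2658153
q1, q3 = 917023, 1716789
mean, std = 1457854.653, 657453.4552

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print("range:   ", value_range)
print("IQR:     ", iqr)
print("CV:      ", round(cv, 6))
print("variance:", f"{variance:.6e}")

# The reported figures agree with these definitions up to rounding.
assert math.isclose(cv, 0.4509732531, rel_tol=1e-6)
assert math.isclose(variance, 4.322450457e11, rel_tol=1e-6)
```

The range and IQR match the reported 2621323 and 799766 exactly; CV and variance match to the precision of the printed inputs.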
Histogram with fixed size bins (bins=50)
Value                   Count      Frequency (%)
1706379                 5986       < 0.1%
188150                  3720       < 0.1%
1653560                 3540       < 0.1%
179043                  2604       < 0.1%
887695                  2363       < 0.1%
185622                  2328       < 0.1%
1741318                 2124       < 0.1%
1663882                 2112       < 0.1%
895818                  2112       < 0.1%
869668                  2040       < 0.1%
Other values (231240)   33024831   99.9%

Value     Count   Frequency (%)
36830     144     < 0.1%
36883     36      < 0.1%
37083     60      < 0.1%
37128     36      < 0.1%
37129     72      < 0.1%

Value     Count   Frequency (%)
2658153   36      < 0.1%
2658152   156     < 0.1%
2658151   348     < 0.1%
2658150   228     < 0.1%
2658149   156     < 0.1%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 252.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 264430080
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9a0c095e
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
Value      Count      Frequency (%)
a55475b1   32583584   98.6%
9a0c095e   320167     1.0%
8fd95e4b   149736     0.5%
06fb9ba8   273        < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 98220655
37.1%
a 32904024
 
12.4%
b 32733866
 
12.4%
4 32733320
 
12.4%
7 32583584
 
12.3%
1 32583584
 
12.3%
9 790343
 
0.3%
0 640607
 
0.2%
e 469903
 
0.2%
c 320167
 
0.1%
Other values (4) 450027
 
0.2%
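The character frequencies follow directly from the value counts of this variable: each distinct value contributes its characters once per row in which it occurs. A minimal sketch of that aggregation, using the four values and counts listed above (how the profiler computes this internally is not shown in the report, so this is only an equivalent calculation):

```python
from collections import Counter

# Value counts copied from the table for this text variable.
value_counts = {
    "a55475b1": 32_583_584,
    "9a0c095e": 320_167,
    "8fd95e4b": 149_736,
    "06fb9ba8": 273,
}

# Weight every character by the number of rows holding its value.
char_counts = Counter()
for value, n_rows in value_counts.items():
    for ch in value:
        char_counts[ch] += n_rows

total_chars = sum(char_counts.values())
for ch, n in char_counts.most_common(3):
    print(ch, n, f"{100 * n / total_chars:.1f}%")
```

The totals agree with the report: 264430080 characters overall, 14 distinct characters, and 98220655 occurrences of "5", most of which come from the three "5"s in the dominant value a55475b1.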

Most occurring categories

ValueCountFrequency (%)
Decimal Number 197702375
74.8%
Lowercase Letter 66727705
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 98220655
49.7%
4 32733320
 
16.6%
7 32583584
 
16.5%
1 32583584
 
16.5%
9 790343
 
0.4%
0 640607
 
0.3%
8 150009
 
0.1%
6 273
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 32904024
49.3%
b 32733866
49.1%
e 469903
 
0.7%
c 320167
 
0.5%
f 150009
 
0.2%
d 149736
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 197702375
74.8%
Latin 66727705
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 98220655
49.7%
4 32733320
 
16.6%
7 32583584
 
16.5%
1 32583584
 
16.5%
9 790343
 
0.4%
0 640607
 
0.3%
8 150009
 
0.1%
6 273
 
< 0.1%
Latin
ValueCountFrequency (%)
a 32904024
49.3%
b 32733866
49.1%
e 469903
 
0.7%
c 320167
 
0.5%
f 150009
 
0.2%
d 149736
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 264430080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 98220655
37.1%
a 32904024
 
12.4%
b 32733866
 
12.4%
4 32733320
 
12.4%
7 32583584
 
12.3%
1 32583584
 
12.3%
9 790343
 
0.3%
0 640607
 
0.2%
e 469903
 
0.2%
c 320167
 
0.1%
Other values (4) 450027
 
0.2%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 252.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 264430080
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8fd95e4b
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 31725982
96.0%
9a0c095e 721659
 
2.2%
8fd95e4b 603656
 
1.8%
06fb9ba8 2129
 
< 0.1%
3cbe86ba 322
 
< 0.1%
c7a5ad39 10
 
< 0.1%
9276e4bb 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 96503271
36.5%
a 32450112
 
12.3%
b 32334544
 
12.2%
4 32329640
 
12.2%
7 31725994
 
12.0%
1 31725982
 
12.0%
9 2049115
 
0.8%
0 1445447
 
0.5%
e 1325639
 
0.5%
c 721991
 
0.3%
Other values (6) 1818345
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 196388343
74.3%
Lowercase Letter 68041737
 
25.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 96503271
49.1%
4 32329640
 
16.5%
7 31725994
 
16.2%
1 31725982
 
16.2%
9 2049115
 
1.0%
0 1445447
 
0.7%
8 606107
 
0.3%
6 2453
 
< 0.1%
3 332
 
< 0.1%
2 2
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 32450112
47.7%
b 32334544
47.5%
e 1325639
 
1.9%
c 721991
 
1.1%
f 605785
 
0.9%
d 603666
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Common 196388343
74.3%
Latin 68041737
 
25.7%

Most frequent character per script

Common
ValueCountFrequency (%)
5 96503271
49.1%
4 32329640
 
16.5%
7 31725994
 
16.2%
1 31725982
 
16.2%
9 2049115
 
1.0%
0 1445447
 
0.7%
8 606107
 
0.3%
6 2453
 
< 0.1%
3 332
 
< 0.1%
2 2
 
< 0.1%
Latin
ValueCountFrequency (%)
a 32450112
47.7%
b 32334544
47.5%
e 1325639
 
1.9%
c 721991
 
1.1%
f 605785
 
0.9%
d 603666
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 264430080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 96503271
36.5%
a 32450112
 
12.3%
b 32334544
 
12.2%
4 32329640
 
12.2%
7 31725994
 
12.0%
1 31725982
 
12.0%
9 2049115
 
0.8%
0 1445447
 
0.5%
e 1325639
 
0.5%
c 721991
 
0.3%
Other values (6) 1818345
 
0.7%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 22269
Distinct (%): 4.7%
Missing: 32583584
Missing (%): 98.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 10444179.1
Minimum: 0
Maximum: 1.875995509 × 10^10
Zeros: 430530
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5887446.5
Maximum: 1.875995509 × 10^10
Range: 1.875995509 × 10^10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 279746416.4
Coefficient of variation (CV): 26.78491183
Kurtosis: 2151.88332
Mean: 10444179.1
Median Absolute Deviation (MAD): 0
Skewness: 44.20781194
Sum: 4.910602354 × 10^12
Variance: 7.825805749 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count      Frequency (%)
0                      430530     1.3%
1300000000             427        < 0.1%
3000000                250        < 0.1%
4000000                211        < 0.1%
1                      196        < 0.1%
2000000                187        < 0.1%
5000000                183        < 0.1%
10000000               173        < 0.1%
500000000              160        < 0.1%
20000000               160        < 0.1%
Other values (22259)   37699      0.1%
(Missing)              32583584   98.6%

Value   Count    Frequency (%)
0       430530   1.3%
0.1     2        < 0.1%
1       196      < 0.1%
62.87   1        < 0.1%
600     1        < 0.1%

Value                 Count   Frequency (%)
1.875995509 × 10^10   2       < 0.1%
1.758995509 × 10^10   2       < 0.1%
1.59 × 10^10          2       < 0.1%
1.43 × 10^10          126     < 0.1%
1.29 × 10^10          2       < 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 45779
Distinct (%): 3.4%
Missing: 31725982
Missing (%): 96.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4160211.827
Minimum: 0
Maximum: 1.758995509 × 10^10
Zeros: 1178949
Zeros (%): 3.6%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 205000
Maximum: 1.758995509 × 10^10
Range: 1.758995509 × 10^10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 153782034.7
Coefficient of variation (CV): 36.96495301
Kurtosis: 7171.495339
Mean: 4160211.827
Median Absolute Deviation (MAD): 0
Skewness: 79.59866574
Sum: 5.523837739 × 10^12
Variance: 2.364891419 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count      Frequency (%)
0                      1178949    3.6%
60000                  3771       < 0.1%
130000                 2951       < 0.1%
100000                 2693       < 0.1%
50000                  1947       < 0.1%
65000                  1855       < 0.1%
80000                  1310       < 0.1%
150000                 1247       < 0.1%
70000                  1238       < 0.1%
200000                 1193       < 0.1%
Other values (45769)   130624     0.4%
(Missing)              31725982   96.0%

Value   Count     Frequency (%)
0       1178949   3.6%
0.01    12        < 0.1%
0.02    50        < 0.1%
0.03    28        < 0.1%
0.04    3         < 0.1%

Value                 Count   Frequency (%)
1.758995509 × 10^10   2       < 0.1%
1.59 × 10^10          2       < 0.1%
1.43 × 10^10          117     < 0.1%
6903800000            8       < 0.1%
6804986362            32      < 0.1%

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 252.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 264430080
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3cbe86ba
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 31725982
96.0%
c7a5ad39 1002634
 
3.0%
3cbe86ba 223621
 
0.7%
9276e4bb 35479
 
0.1%
0e63c0f0 20524
 
0.1%
168ad9f3 9980
 
< 0.1%
7b62420e 8534
 
< 0.1%
5224034a 7598
 
< 0.1%
940efad7 6724
 
< 0.1%
2fd21cf1 3780
 
< 0.1%
Other values (5) 8904
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 96193183
36.4%
a 33987913
 
12.9%
7 32784858
 
12.4%
b 32255626
 
12.2%
4 31802888
 
12.0%
1 31744990
 
12.0%
3 1267894
 
0.5%
c 1255564
 
0.5%
9 1061891
 
0.4%
d 1025575
 
0.4%
Other values (6) 1049698
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 195561669
74.0%
Lowercase Letter 68868411
 
26.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 96193183
49.2%
7 32784858
 
16.8%
4 31802888
 
16.3%
1 31744990
 
16.2%
3 1267894
 
0.6%
9 1061891
 
0.5%
6 301048
 
0.2%
8 237526
 
0.1%
0 89631
 
< 0.1%
2 77760
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 33987913
49.4%
b 32255626
46.8%
c 1255564
 
1.8%
d 1025575
 
1.5%
e 296350
 
0.4%
f 47383
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 195561669
74.0%
Latin 68868411
 
26.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 96193183
49.2%
7 32784858
 
16.8%
4 31802888
 
16.3%
1 31744990
 
16.2%
3 1267894
 
0.6%
9 1061891
 
0.5%
6 301048
 
0.2%
8 237526
 
0.1%
0 89631
 
< 0.1%
2 77760
 
< 0.1%
Latin
ValueCountFrequency (%)
a 33987913
49.4%
b 32255626
46.8%
c 1255564
 
1.8%
d 1025575
 
1.5%
e 296350
 
0.4%
f 47383
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 264430080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 96193183
36.4%
a 33987913
 
12.9%
7 32784858
 
12.4%
b 32255626
 
12.2%
4 31802888
 
12.0%
1 31744990
 
12.0%
3 1267894
 
0.5%
c 1255564
 
0.5%
9 1061891
 
0.4%
d 1025575
 
0.4%
Other values (6) 1049698
 
0.4%

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 252.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 264430080
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c7a5ad39
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 32583584
98.6%
c7a5ad39 416281
 
1.3%
9276e4bb 20888
 
0.1%
0e63c0f0 14267
 
< 0.1%
168ad9f3 7741
 
< 0.1%
7b62420e 5328
 
< 0.1%
3cbe86ba 1803
 
< 0.1%
f4d8a027 1344
 
< 0.1%
940efad7 1223
 
< 0.1%
2fd21cf1 575
 
< 0.1%
Other values (5) 726
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 98167722
37.1%
a 33428888
 
12.6%
7 33028817
 
12.5%
b 32634458
 
12.3%
4 32613528
 
12.3%
1 32592602
 
12.3%
9 446501
 
0.2%
3 440654
 
0.2%
c 433237
 
0.2%
d 427164
 
0.2%
Other values (6) 216509
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 197436967
74.7%
Lowercase Letter 66993113
 
25.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 98167722
49.7%
7 33028817
 
16.7%
4 32613528
 
16.5%
1 32592602
 
16.5%
9 446501
 
0.2%
3 440654
 
0.2%
0 51143
 
< 0.1%
6 50191
 
< 0.1%
2 34794
 
< 0.1%
8 11015
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 33428888
49.9%
b 32634458
48.7%
c 433237
 
0.6%
d 427164
 
0.6%
e 43636
 
0.1%
f 25730
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 197436967
74.7%
Latin 66993113
 
25.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 98167722
49.7%
7 33028817
 
16.7%
4 32613528
 
16.5%
1 32592602
 
16.5%
9 446501
 
0.2%
3 440654
 
0.2%
0 51143
 
< 0.1%
6 50191
 
< 0.1%
2 34794
 
< 0.1%
8 11015
 
< 0.1%
Latin
ValueCountFrequency (%)
a 33428888
49.9%
b 32634458
48.7%
c 433237
 
0.6%
d 427164
 
0.6%
e 43636
 
0.1%
f 25730
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 264430080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 98167722
37.1%
a 33428888
 
12.6%
7 33028817
 
12.5%
b 32634458
 
12.3%
4 32613528
 
12.3%
1 32592602
 
12.3%
9 446501
 
0.2%
3 440654
 
0.2%
c 433237
 
0.2%
d 427164
 
0.2%
Other values (6) 216509
 
0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 272
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.701797224
Minimum: 0
Maximum: 271
Zeros: 6717721
Zeros (%): 20.3%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 6
95-th percentile: 15
Maximum: 271
Range: 271
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 6.427976775
Coefficient of variation (CV): 1.367131858
Kurtosis: 106.8148704
Mean: 4.701797224
Median Absolute Deviation (MAD): 3
Skewness: 6.169206137
Sum: 155412077
Variance: 41.31888542
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 6717721
20.3%
1 4900392
14.8%
2 3807814
11.5%
3 3071154
9.3%
4 2522337
 
7.6%
5 2085478
 
6.3%
6 1727995
 
5.2%
7 1425804
 
4.3%
8 1175784
 
3.6%
9 975470
 
3.0%
Other values (262) 4643811
14.0%
ValueCountFrequency (%)
0 6717721
20.3%
1 4900392
14.8%
2 3807814
11.5%
3 3071154
9.3%
4 2522337
 
7.6%
ValueCountFrequency (%)
271 12
< 0.1%
270 12
< 0.1%
269 12
< 0.1%
268 12
< 0.1%
267 12
< 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct: 101
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.20154557
Minimum: 0
Maximum: 100
Zeros: 1377202
Zeros (%): 4.2%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
Median: 12
Q3: 20
95-th percentile: 31
Maximum: 100
Range: 100
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 9.278772653
Coefficient of variation (CV): 0.7028550257
Kurtosis: -0.4330814259
Mean: 13.20154557
Median Absolute Deviation (MAD): 7
Skewness: 0.545312801
Sum: 436360719
Variance: 86.09562195
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1377202
 
4.2%
1 1377070
 
4.2%
7 1377069
 
4.2%
11 1377069
 
4.2%
10 1377069
 
4.2%
9 1377069
 
4.2%
8 1377069
 
4.2%
6 1377069
 
4.2%
5 1377069
 
4.2%
4 1377069
 
4.2%
Other values (91) 19282936
58.3%
ValueCountFrequency (%)
0 1377202
4.2%
1 1377070
4.2%
2 1377069
4.2%
3 1377069
4.2%
4 1377069
4.2%
ValueCountFrequency (%)
100 1
 
< 0.1%
99 1
 
< 0.1%
98 60
< 0.1%
97 60
< 0.1%
96 60
< 0.1%

pmts_dpd_1073P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3971
Distinct (%): 0.1%
Missing: 27121373
Missing (%): 82.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.62910764
Minimum: 0
Maximum: 4595
Zeros: 5596903
Zeros (%): 16.9%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 4595
Range: 4595
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 143.0431327
Coefficient of variation (CV): 11.32646397
Kurtosis: 325.6255625
Mean: 12.62910764
Median Absolute Deviation (MAD): 0
Skewness: 16.44645718
Sum: 74920754
Variance: 20461.33781
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 5596903
 
16.9%
1 55318
 
0.2%
2 20595
 
0.1%
3 19988
 
0.1%
4 16944
 
0.1%
7 10950
 
< 0.1%
5 10159
 
< 0.1%
6 9495
 
< 0.1%
8 7566
 
< 0.1%
10 7525
 
< 0.1%
Other values (3961) 176944
 
0.5%
(Missing) 27121373
82.1%
ValueCountFrequency (%)
0 5596903
16.9%
1 55318
 
0.2%
2 20595
 
0.1%
3 19988
 
0.1%
4 16944
 
0.1%
ValueCountFrequency (%)
4595 1
< 0.1%
4590 1
< 0.1%
4562 1
< 0.1%
4537 1
< 0.1%
4536 2
< 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 4338
Distinct (%): < 0.1%
Missing: 18691259
Missing (%): 56.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.52469796
Minimum: -30
Maximum: 117000
Zeros: 12092577
Zeros (%): 36.6%
Negative: 2579
Negative (%): < 0.1%
Memory size: 252.2 MiB

Quantile statistics

Minimum: -30
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 322
Maximum: 117000
Range: 117030
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 274.7676134
Coefficient of variation (CV): 5.039323897
Kurtosis: 8288.849719
Mean: 54.52469796
Median Absolute Deviation (MAD): 0
Skewness: 31.61005875
Sum: 783111029
Variance: 75497.24137
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 12092577
36.6%
1 333585
 
1.0%
3 87221
 
0.3%
2 82689
 
0.3%
4 71130
 
0.2%
6 58203
 
0.2%
5 46779
 
0.1%
7 45654
 
0.1%
9 32873
 
0.1%
8 31830
 
0.1%
Other values (4328) 1479960
 
4.5%
(Missing) 18691259
56.5%
ValueCountFrequency (%)
-30 1
 
< 0.1%
-10 11
 
< 0.1%
-9 2
 
< 0.1%
-8 24
< 0.1%
-7 57
< 0.1%
ValueCountFrequency (%)
117000 1
 
< 0.1%
84575 3
< 0.1%
84574 1
 
< 0.1%
84561 1
 
< 0.1%
84560 1
 
< 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 23804232
Missing (%): 72.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
Median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052716
Coefficient of variation (CV): 0.5310850333
Kurtosis: -1.216783226
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 60121932
Variance: 11.91666796
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 770794
 
2.3%
3 770794
 
2.3%
4 770794
 
2.3%
5 770794
 
2.3%
6 770794
 
2.3%
7 770794
 
2.3%
8 770794
 
2.3%
9 770794
 
2.3%
10 770794
 
2.3%
11 770794
 
2.3%
Other values (2) 1541588
 
4.7%
(Missing) 23804232
72.0%
ValueCountFrequency (%)
1 770794
2.3%
2 770794
2.3%
3 770794
2.3%
4 770794
2.3%
5 770794
2.3%
ValueCountFrequency (%)
12 770794
2.3%
11 770794
2.3%
10 770794
2.3%
9 770794
2.3%
8 770794
2.3%
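Every month from 1 to 12 occurs exactly 770794 times in pmts_month_158T, so the non-missing values form a discrete uniform distribution. The sketch below computes the population moments of that distribution with exact fractions. It is a cross-check rather than the profiler's own calculation: the report presumably applies sample corrections, which would explain the small differences in the trailing digits.

```python
from fractions import Fraction

# Equal counts per month mean one copy of each value is enough.
months = range(1, 13)
n = len(months)

mean = Fraction(sum(months), n)

# Central moments of the population (no sample correction).
m2 = sum((m - mean) ** 2 for m in months) / n
m4 = sum((m - mean) ** 4 for m in months) / n
excess_kurtosis = m4 / m2**2 - 3

print("mean:           ", float(mean))
print("variance:       ", float(m2))
print("excess kurtosis:", float(excess_kurtosis))
```

The exact values are a mean of 13/2, a variance of 143/12 (about 11.9167) and an excess kurtosis of -174/143 (about -1.21678), in line with the reported 6.5, 11.91666796 and -1.216783226. The zero skewness follows from the symmetry of the distribution around 6.5.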

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 2619852
Missing (%): 7.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
Median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052586
Coefficient of variation (CV): 0.5310850133
Kurtosis: -1.21678322
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 197820402
Variance: 11.91666706
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 2536159
7.7%
3 2536159
7.7%
4 2536159
7.7%
5 2536159
7.7%
6 2536159
7.7%
7 2536159
7.7%
8 2536159
7.7%
9 2536159
7.7%
10 2536159
7.7%
11 2536159
7.7%
Other values (2) 5072318
15.3%
(Missing) 2619852
7.9%
ValueCountFrequency (%)
1 2536159
7.7%
2 2536159
7.7%
3 2536159
7.7%
4 2536159
7.7%
5 2536159
7.7%
ValueCountFrequency (%)
12 2536159
7.7%
11 2536159
7.7%
10 2536159
7.7%
9 2536159
7.7%
8 2536159
7.7%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 241676
Distinct (%): 4.1%
Missing: 27091280
Missing (%): 82.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2047.064177
Minimum: 0
Maximum: 401980320
Zeros: 5621230
Zeros (%): 17.0%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 464.67798
Maximum: 401980320
Range: 401980320
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 265823.9755
Coefficient of variation (CV): 129.8562001
Kurtosis: 1777281.069
Mean: 2047.064177
Median Absolute Deviation (MAD): 0
Skewness: 1218.634842
Sum: 1.220557921 × 10^10
Variance: 7.066238593 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 5621230
 
17.0%
400 470
 
< 0.1%
14 388
 
< 0.1%
1000 387
 
< 0.1%
10 372
 
< 0.1%
2000 296
 
< 0.1%
0.8 251
 
< 0.1%
2600 250
 
< 0.1%
0.2 244
 
< 0.1%
2 231
 
< 0.1%
Other values (241666) 338361
 
1.0%
(Missing) 27091280
82.0%
ValueCountFrequency (%)
0 5621230
17.0%
0.002 24
 
< 0.1%
0.004 30
 
< 0.1%
0.006 25
 
< 0.1%
0.008 36
 
< 0.1%
ValueCountFrequency (%)
401980320 2
 
< 0.1%
132118310 2
 
< 0.1%
50082108 5
< 0.1%
48865336 1
 
< 0.1%
48513324 1
 
< 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 946534
Distinct (%): 6.6%
Missing: 18674260
Missing (%): 56.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 4190.192563
Minimum: 0
Maximum: 50906470
Zeros: 12008791
Zeros (%): 36.3%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 14885.33825
Maximum: 50906470
Range: 50906470
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 105340.4401
Coefficient of variation (CV): 25.13976113
Kurtosis: 122510.518
Mean: 4190.192563
Median Absolute Deviation (MAD): 0
Skewness: 305.0209325
Sum: 6.025287397 × 10^10
Variance: 1.109660832 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 12008791
36.3%
0.2 7194
 
< 0.1%
1000 4299
 
< 0.1%
0.4 3012
 
< 0.1%
2000 2910
 
< 0.1%
3000 2481
 
< 0.1%
0.8 2258
 
< 0.1%
0.6 2066
 
< 0.1%
2 2066
 
< 0.1%
1.6 1957
 
< 0.1%
Other values (946524) 2342466
 
7.1%
(Missing) 18674260
56.5%
ValueCountFrequency (%)
0 12008791
36.3%
0.002 215
 
< 0.1%
0.004 113
 
< 0.1%
0.006 111
 
< 0.1%
0.008 89
 
< 0.1%
ValueCountFrequency (%)
50906470 1
< 0.1%
48977290 1
< 0.1%
48927640 1
< 0.1%
48927636 2
< 0.1%
48807636 1
< 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): < 0.1%
Missing: 23804232
Missing (%): 72.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.580244
Minimum: 2016
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 2016
5-th percentile: 2017
Q1: 2018
Median: 2019
Q3: 2019
95-th percentile: 2020
Maximum: 2021
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7459096442
Coefficient of variation (CV): 0.0003695219184
Kurtosis: -0.04878475407
Mean: 2018.580244
Median Absolute Deviation (MAD): 0
Skewness: -0.534206743
Sum: 1.867091449 × 10^10
Variance: 0.5563811973
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
2019 5182120
 
15.7%
2018 2635644
 
8.0%
2017 894940
 
2.7%
2020 529346
 
1.6%
2021 7214
 
< 0.1%
2016 264
 
< 0.1%
(Missing) 23804232
72.0%
ValueCountFrequency (%)
2016 264
 
< 0.1%
2017 894940
 
2.7%
2018 2635644
8.0%
2019 5182120
15.7%
2020 529346
 
1.6%
ValueCountFrequency (%)
2021 7214
 
< 0.1%
2020 529346
 
1.6%
2019 5182120
15.7%
2018 2635644
8.0%
2017 894940
 
2.7%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct: 21
Distinct (%): < 0.1%
Missing: 2619852
Missing (%): 7.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.328105
Minimum: 2001
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 252.2 MiB

Quantile statistics

Minimum: 2001
5-th percentile: 2007
Q1: 2012
Median: 2015
Q3: 2018
95-th percentile: 2019
Maximum: 2021
Range: 20
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.93218005
Coefficient of variation (CV): 0.001952105042
Kurtosis: -0.6057916566
Mean: 2014.328105
Median Absolute Deviation (MAD): 3
Skewness: -0.6855102705
Sum: 6.130387622 × 10^10
Variance: 15.46203994
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%)
2018 4590473
13.9%
2017 3791656
11.5%
2019 3340832
10.1%
2016 2736638
8.3%
2015 2443386
7.4%
2014 2274555
6.9%
2013 1980486
 
6.0%
2012 1628101
 
4.9%
2011 1421849
 
4.3%
2007 1196710
 
3.6%
Other values (11) 5029222
15.2%
(Missing) 2619852
7.9%
ValueCountFrequency (%)
2001 209
 
< 0.1%
2002 1108
 
< 0.1%
2003 2761
 
< 0.1%
2004 71995
 
0.2%
2005 342980
1.0%
ValueCountFrequency (%)
2021 157
 
< 0.1%
2020 270177
 
0.8%
2019 3340832
10.1%
2018 4590473
13.9%
2017 3791656
11.5%

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 252.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 264430080
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 31739743
96.0%
ab3c25cf 1288784
 
3.9%
15f04f45 13783
 
< 0.1%
be4fd70b 6821
 
< 0.1%
daf49a8a 4530
 
< 0.1%
71ddaa88 49
 
< 0.1%
0c42a10e 24
 
< 0.1%
1d94eac1 20
 
< 0.1%
9ba4314a 4
 
< 0.1%
652d52e3 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 96535583
36.5%
a 33042267
 
12.5%
b 33042173
 
12.5%
4 31778712
 
12.0%
1 31753643
 
12.0%
7 31746613
 
12.0%
c 2577612
 
1.0%
f 1327701
 
0.5%
2 1288812
 
0.5%
3 1288790
 
0.5%
Other values (6) 48174
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 194421989
73.5%
Lowercase Letter 70008091
 
26.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 96535583
49.7%
4 31778712
 
16.3%
1 31753643
 
16.3%
7 31746613
 
16.3%
2 1288812
 
0.7%
3 1288790
 
0.7%
0 20652
 
< 0.1%
8 4628
 
< 0.1%
9 4554
 
< 0.1%
6 2
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 33042267
47.2%
b 33042173
47.2%
c 2577612
 
3.7%
f 1327701
 
1.9%
d 11471
 
< 0.1%
e 6867
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 194421989
73.5%
Latin 70008091
 
26.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 96535583
49.7%
4 31778712
 
16.3%
1 31753643
 
16.3%
7 31746613
 
16.3%
2 1288812
 
0.7%
3 1288790
 
0.7%
0 20652
 
< 0.1%
8 4628
 
< 0.1%
9 4554
 
< 0.1%
6 2
 
< 0.1%
Latin
ValueCountFrequency (%)
a 33042267
47.2%
b 33042173
47.2%
c 2577612
 
3.7%
f 1327701
 
1.9%
d 11471
 
< 0.1%
e 6867
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 264430080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 96535583
36.5%
a 33042267
 
12.5%
b 33042173
 
12.5%
4 31778712
 
12.0%
1 31753643
 
12.0%
7 31746613
 
12.0%
c 2577612
 
1.0%
f 1327701
 
0.5%
2 1288812
 
0.5%
3 1288790
 
0.5%
Other values (6) 48174
 
< 0.1%

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 252.2 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000020784
Min length: 8

Characters and Unicode

Total characters: 264430767
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 32600314
98.6%
ab3c25cf 441403
 
1.3%
be4fd70b 4584
 
< 0.1%
15f04f45 3496
 
< 0.1%
daf49a8a 3257
 
< 0.1%
p28_48_88 687
 
< 0.1%
0c42a10e 11
 
< 0.1%
71ddaa88 7
 
< 0.1%
652d52e3 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 98249339
37.2%
a 33051513
 
12.5%
b 33050885
 
12.5%
4 32615845
 
12.3%
7 32604905
 
12.3%
1 32603828
 
12.3%
c 882817
 
0.3%
f 456236
 
0.2%
2 442103
 
0.2%
3 441404
 
0.2%
Other values (8) 31892
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 196974803
74.5%
Lowercase Letter 67453903
 
25.5%
Connector Punctuation 1374
 
< 0.1%
Uppercase Letter 687
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 98249339
49.9%
4 32615845
 
16.6%
7 32604905
 
16.6%
1 32603828
 
16.6%
2 442103
 
0.2%
3 441404
 
0.2%
0 8102
 
< 0.1%
8 6019
 
< 0.1%
9 3257
 
< 0.1%
6 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 33051513
49.0%
b 33050885
49.0%
c 882817
 
1.3%
f 456236
 
0.7%
d 7856
 
< 0.1%
e 4596
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1374
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 687
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 196976177
74.5%
Latin 67454590
 
25.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 98249339
49.9%
4 32615845
 
16.6%
7 32604905
 
16.6%
1 32603828
 
16.6%
2 442103
 
0.2%
3 441404
 
0.2%
0 8102
 
< 0.1%
8 6019
 
< 0.1%
9 3257
 
< 0.1%
_ 1374
 
< 0.1%
Latin
ValueCountFrequency (%)
a 33051513
49.0%
b 33050885
49.0%
c 882817
 
1.3%
f 456236
 
0.7%
d 7856
 
< 0.1%
e 4596
 
< 0.1%
P 687
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 264430767
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 98249339
37.2%
a 33051513
 
12.5%
b 33050885
 
12.5%
4 32615845
 
12.3%
7 32604905
 
12.3%
1 32603828
 
12.3%
c 882817
 
0.3%
f 456236
 
0.2%
2 442103
 
0.2%
3 441404
 
0.2%
Other values (8) 31892
 
< 0.1%