Overview

Dataset statistics

Number of variables: 19
Number of observations: 4386062
Missing cells: 28029984
Missing cells (%): 33.6%
Total size in memory: 635.8 MiB
Average record size in memory: 152.0 B
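
The two derived figures in this table follow from the raw counts. A minimal sketch, using only the numbers reported above (the variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 4_386_062
missing_cells = 28_029_984
total_size_mib = 635.8

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures shown in the table (33.6% and 152.0 B).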

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 4340725 (99.0%) missing values [Missing]
collater_valueofguarantee_876L has 4209154 (96.0%) missing values [Missing]
pmts_dpd_1073P has 3729399 (85.0%) missing values [Missing]
pmts_dpd_303P has 2445060 (55.7%) missing values [Missing]
pmts_month_158T has 3280058 (74.8%) missing values [Missing]
pmts_month_706T has 288770 (6.6%) missing values [Missing]
pmts_overdue_1140A has 3725455 (84.9%) missing values [Missing]
pmts_overdue_1152A has 2442535 (55.7%) missing values [Missing]
pmts_year_1139T has 3280058 (74.8%) missing values [Missing]
pmts_year_507T has 288770 (6.6%) missing values [Missing]
collater_valueofguarantee_1124L is highly skewed (γ1 = 81.69064668) [Skewed]
collater_valueofguarantee_876L is highly skewed (γ1 = 80.87769684) [Skewed]
pmts_dpd_1073P is highly skewed (γ1 = 26.9832622) [Skewed]
pmts_dpd_303P is highly skewed (γ1 = 92.41077498) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 105.9926051) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 186.8551368) [Skewed]
collater_valueofguarantee_876L has 161055 (3.7%) zeros [Zeros]
num_group1 has 755928 (17.2%) zeros [Zeros]
num_group2 has 178326 (4.1%) zeros [Zeros]
pmts_dpd_1073P has 624821 (14.2%) zeros [Zeros]
pmts_dpd_303P has 1622067 (37.0%) zeros [Zeros]
pmts_overdue_1140A has 628168 (14.3%) zeros [Zeros]
pmts_overdue_1152A has 1612253 (36.8%) zeros [Zeros]
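
The alerts above can be thought of as simple threshold rules applied to per-column summaries. The sketch below is illustrative only: the cut-offs `SKEW_THRESHOLD` and `ZEROS_THRESHOLD`, the `ColumnSummary` class and the `alerts_for` function are assumptions chosen to be consistent with the alerts listed here, not values or names taken from the profiling tool.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, picked only so they agree with the alerts listed
# in this report; they are not read from the profiling tool's configuration.
SKEW_THRESHOLD = 20.0    # |skewness| above this -> "Skewed"
ZEROS_THRESHOLD = 0.02   # share of zeros above this -> "Zeros"


@dataclass
class ColumnSummary:
    name: str
    n_rows: int
    n_missing: int
    n_zeros: int
    skewness: float


def alerts_for(col):
    """Return the alert labels a column would get under the assumed rules."""
    labels = []
    if col.n_missing > 0:
        labels.append("Missing")
    if abs(col.skewness) > SKEW_THRESHOLD:
        labels.append("Skewed")
    if col.n_zeros / col.n_rows > ZEROS_THRESHOLD:
        labels.append("Zeros")
    return labels


# Summaries copied from the variable sections of this report.
guarantee = ColumnSummary(
    "collater_valueofguarantee_1124L", 4_386_062, 4_340_725, 42_324, 81.69064668
)
group1 = ColumnSummary("num_group1", 4_386_062, 0, 755_928, 3.8286012)

print(guarantee.name, alerts_for(guarantee))
print(group1.name, alerts_for(group1))
```

Under these assumed rules the first column is flagged as missing and skewed but not for zeros (its zero share is about 1.0%), and the second only for zeros, which matches the badges shown for both variables.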

Reproduction

Analysis started: 2024-02-13 19:44:21.892070
Analysis finished: 2024-02-13 19:44:33.608998
Duration: 11.72 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
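
The reported duration is the difference between the two timestamps. A short check with the standard library, using the values above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-02-13 19:44:21.892070")
finished = datetime.fromisoformat("2024-02-13 19:44:33.608998")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```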

Variables

case_id
Real number (ℝ)

Distinct: 23734
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1570147.672
Minimum: 56408
Maximum: 2703454
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.5 MiB

Quantile statistics

Minimum: 56408
5-th percentile: 256297
Q1: 1023878
Median: 1939098
Q3: 1944519
95-th percentile: 2702346
Maximum: 2703454
Range: 2647046
Interquartile range (IQR): 920641

Descriptive statistics

Standard deviation: 819484.4816
Coefficient of variation (CV): 0.5219155474
Kurtosis: -0.9099583482
Mean: 1570147.672
Median Absolute Deviation (MAD): 6973
Skewness: -0.5919843554
Sum: 6.886765038 × 10^12
Variance: 6.715548156 × 10^11
Monotonicity: Increasing
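
Several of the case_id statistics are derived from others in the same block. A minimal sketch that recomputes them from the reported values (variable names are illustrative):

```python
# Values copied from the case_id quantile and descriptive statistics.
minimum, q1, q3, maximum = 56_408, 1_023_878, 1_944_519, 2_703_454
mean, std = 1_570_147.672, 819_484.4816

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr)
print(f"CV = {cv:.10f}")
print(f"Variance = {variance:.9e}")
```

The results agree with the reported range (2647046), IQR (920641), CV (0.5219155474) and variance (about 6.7155 × 10^11).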
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
1936653               2004     < 0.1%
257343                1716     < 0.1%
257052                1704     < 0.1%
1939829               1572     < 0.1%
1025405               1332     < 0.1%
255943                1320     < 0.1%
1945583               1272     < 0.1%
1937123               1224     < 0.1%
1939842               1212     < 0.1%
1942430               1212     < 0.1%
Other values (23724)  4371494  99.7%

Value  Count  Frequency (%)
56408  108    < 0.1%
56451  24     < 0.1%
56556  192    < 0.1%
56579  84     < 0.1%
56703  24     < 0.1%

Value    Count  Frequency (%)
2703454  252    < 0.1%
2703453  348    < 0.1%
2703452  72     < 0.1%
2703451  96     < 0.1%
2703450  252    < 0.1%

(variable name missing)
Text

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size33.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters35088496
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowa55475b1
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 4340725
99.0%
9a0c095e 31104
 
0.7%
8fd95e4b 14216
 
0.3%
06fb9ba8 14
 
< 0.1%
26cf31be 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 13067495
37.2%
a 4371843
 
12.5%
b 4354972
 
12.4%
4 4354941
 
12.4%
1 4340728
 
12.4%
7 4340725
 
12.4%
9 76438
 
0.2%
0 62222
 
0.2%
e 45323
 
0.1%
c 31107
 
0.1%
Other values (6) 42702
 
0.1%
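
The character counts for this text variable follow directly from its value frequencies: each value contributes its characters once per occurrence. A minimal sketch using the five values and counts listed above:

```python
from collections import Counter

# Value frequencies copied from this variable's table.
value_counts = {
    "a55475b1": 4_340_725,
    "9a0c095e": 31_104,
    "8fd95e4b": 14_216,
    "06fb9ba8": 14,
    "26cf31be": 3,
}

# Weight every character by the number of rows holding that value.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())

print(total_chars, len(char_counts))
print(char_counts.most_common(3))
```

This reproduces the total of 35088496 characters, the 16 distinct characters, and the leading counts for "5", "a" and "b".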

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26256802
74.8%
Lowercase Letter 8831694
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 13067495
49.8%
4 4354941
 
16.6%
1 4340728
 
16.5%
7 4340725
 
16.5%
9 76438
 
0.3%
0 62222
 
0.2%
8 14230
 
0.1%
6 17
 
< 0.1%
2 3
 
< 0.1%
3 3
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 4371843
49.5%
b 4354972
49.3%
e 45323
 
0.5%
c 31107
 
0.4%
f 14233
 
0.2%
d 14216
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 26256802
74.8%
Latin 8831694
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 13067495
49.8%
4 4354941
 
16.6%
1 4340728
 
16.5%
7 4340725
 
16.5%
9 76438
 
0.3%
0 62222
 
0.2%
8 14230
 
0.1%
6 17
 
< 0.1%
2 3
 
< 0.1%
3 3
 
< 0.1%
Latin
ValueCountFrequency (%)
a 4371843
49.5%
b 4354972
49.3%
e 45323
 
0.5%
c 31107
 
0.4%
f 14233
 
0.2%
d 14216
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35088496
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 13067495
37.2%
a 4371843
 
12.5%
b 4354972
 
12.4%
4 4354941
 
12.4%
1 4340728
 
12.4%
7 4340725
 
12.4%
9 76438
 
0.2%
0 62222
 
0.2%
e 45323
 
0.1%
c 31107
 
0.1%
Other values (6) 42702
 
0.1%

(variable name missing)
Text

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size33.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters35088496
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row8fd95e4b
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 4209154
96.0%
9a0c095e 94501
 
2.2%
8fd95e4b 82177
 
1.9%
06fb9ba8 197
 
< 0.1%
3cbe86ba 32
 
< 0.1%
9276e4bb 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 12804140
36.5%
a 4303884
 
12.3%
b 4291791
 
12.2%
4 4291332
 
12.2%
7 4209155
 
12.0%
1 4209154
 
12.0%
9 271377
 
0.8%
0 189199
 
0.5%
e 176711
 
0.5%
c 94533
 
0.3%
Other values (6) 247220
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26057026
74.3%
Lowercase Letter 9031470
 
25.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 12804140
49.1%
4 4291332
 
16.5%
7 4209155
 
16.2%
1 4209154
 
16.2%
9 271377
 
1.0%
0 189199
 
0.7%
8 82406
 
0.3%
6 230
 
< 0.1%
3 32
 
< 0.1%
2 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 4303884
47.7%
b 4291791
47.5%
e 176711
 
2.0%
c 94533
 
1.0%
f 82374
 
0.9%
d 82177
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Common 26057026
74.3%
Latin 9031470
 
25.7%

Most frequent character per script

Common
ValueCountFrequency (%)
5 12804140
49.1%
4 4291332
 
16.5%
7 4209155
 
16.2%
1 4209154
 
16.2%
9 271377
 
1.0%
0 189199
 
0.7%
8 82406
 
0.3%
6 230
 
< 0.1%
3 32
 
< 0.1%
2 1
 
< 0.1%
Latin
ValueCountFrequency (%)
a 4303884
47.7%
b 4291791
47.5%
e 176711
 
2.0%
c 94533
 
1.0%
f 82374
 
0.9%
d 82177
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35088496
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 12804140
36.5%
a 4303884
 
12.3%
b 4291791
 
12.2%
4 4291332
 
12.2%
7 4209155
 
12.0%
1 4209154
 
12.0%
9 271377
 
0.8%
0 189199
 
0.5%
e 176711
 
0.5%
c 94533
 
0.3%
Other values (6) 247220
 
0.7%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED 

Distinct2374
Distinct (%)5.2%
Missing4340725
Missing (%)99.0%
Infinite0
Infinite (%)0.0%
Mean1250201.297
Minimum0
Maximum3515294000
Zeros42324
Zeros (%)1.0%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile3007364.784
Maximum3515294000
Range3515294000
Interquartile range (IQR)0

Descriptive statistics

Standard deviation32106519.85
Coefficient of variation (CV)25.68108027
Kurtosis7871.848916
Mean1250201.297
Median Absolute Deviation (MAD)0
Skewness81.69064668
Sum5.668037618 × 10^10
Variance1.030828617 × 10^15
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 42324
 
1.0%
500000 20
 
< 0.1%
12000000 19
 
< 0.1%
200000 15
 
< 0.1%
4000000 15
 
< 0.1%
11385000 14
 
< 0.1%
3000000 14
 
< 0.1%
2000000 13
 
< 0.1%
5000000 11
 
< 0.1%
30000000 11
 
< 0.1%
Other values (2364) 2881
 
0.1%
(Missing) 4340725
99.0%
ValueCountFrequency (%)
0 42324
1.0%
1 10
 
< 0.1%
1150 1
 
< 0.1%
2712 1
 
< 0.1%
3800 1
 
< 0.1%
ValueCountFrequency (%)
3515294000 1
 
< 0.1%
3200000000 2
< 0.1%
1800000000 1
 
< 0.1%
1050000000 1
 
< 0.1%
1000000000 4
< 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct6556
Distinct (%)3.7%
Missing4209154
Missing (%)96.0%
Infinite0
Infinite (%)0.0%
Mean786298.2358
Minimum0
Maximum3250000000
Zeros161055
Zeros (%)3.7%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile130000
Maximum3250000000
Range3250000000
Interquartile range (IQR)0

Descriptive statistics

Standard deviation31452025.18
Coefficient of variation (CV)40.00012178
Kurtosis7380.550392
Mean786298.2358
Median Absolute Deviation (MAD)0
Skewness80.87769684
Sum1.391024483 × 10^11
Variance9.892298882 × 10^14
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 161055
 
3.7%
60000 464
 
< 0.1%
130000 359
 
< 0.1%
100000 345
 
< 0.1%
30000000 249
 
< 0.1%
50000 205
 
< 0.1%
65000 192
 
< 0.1%
70000 161
 
< 0.1%
150000 161
 
< 0.1%
80000 160
 
< 0.1%
Other values (6546) 13557
 
0.3%
(Missing) 4209154
96.0%
ValueCountFrequency (%)
0 161055
3.7%
0.99 3
 
< 0.1%
1 46
 
< 0.1%
1.8 2
 
< 0.1%
4 1
 
< 0.1%
ValueCountFrequency (%)
3250000000 5
< 0.1%
3200000000 5
< 0.1%
2000000000 11
< 0.1%
1200000000 7
< 0.1%
1015106700 1
 
< 0.1%

(variable name missing)
Text

Distinct15
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size33.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters35088496
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row3cbe86ba
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 4209249
96.0%
c7a5ad39 140275
 
3.2%
3cbe86ba 24588
 
0.6%
9276e4bb 3686
 
0.1%
0e63c0f0 3265
 
0.1%
168ad9f3 1180
 
< 0.1%
5224034a 899
 
< 0.1%
7b62420e 837
 
< 0.1%
940efad7 755
 
< 0.1%
2fd21cf1 481
 
< 0.1%
Other values (5) 847
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 12769513
36.4%
a 4517896
 
12.9%
7 4355347
 
12.4%
b 4267036
 
12.2%
4 4217196
 
12.0%
1 4211675
 
12.0%
3 170515
 
0.5%
c 169201
 
0.5%
9 146512
 
0.4%
d 142828
 
0.4%
Other values (6) 120777
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25951815
74.0%
Lowercase Letter 9136681
 
26.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 12769513
49.2%
7 4355347
 
16.8%
4 4217196
 
16.3%
1 4211675
 
16.2%
3 170515
 
0.7%
9 146512
 
0.6%
6 33958
 
0.1%
8 26189
 
0.1%
0 12653
 
< 0.1%
2 8257
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 4517896
49.4%
b 4267036
46.7%
c 169201
 
1.9%
d 142828
 
1.6%
e 33415
 
0.4%
f 6305
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 25951815
74.0%
Latin 9136681
 
26.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 12769513
49.2%
7 4355347
 
16.8%
4 4217196
 
16.3%
1 4211675
 
16.2%
3 170515
 
0.7%
9 146512
 
0.6%
6 33958
 
0.1%
8 26189
 
0.1%
0 12653
 
< 0.1%
2 8257
 
< 0.1%
Latin
ValueCountFrequency (%)
a 4517896
49.4%
b 4267036
46.7%
c 169201
 
1.9%
d 142828
 
1.6%
e 33415
 
0.4%
f 6305
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35088496
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 12769513
36.4%
a 4517896
 
12.9%
7 4355347
 
12.4%
b 4267036
 
12.2%
4 4217196
 
12.0%
1 4211675
 
12.0%
3 170515
 
0.5%
c 169201
 
0.5%
9 146512
 
0.4%
d 142828
 
0.4%
Other values (6) 120777
 
0.3%

(variable name missing)
Text

Distinct14
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size33.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters35088496
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowa55475b1
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 4340785
99.0%
c7a5ad39 41366
 
0.9%
9276e4bb 1591
 
< 0.1%
0e63c0f0 1109
 
< 0.1%
168ad9f3 442
 
< 0.1%
7b62420e 439
 
< 0.1%
940efad7 116
 
< 0.1%
f4d8a027 79
 
< 0.1%
3cbe86ba 51
 
< 0.1%
2fd21cf1 41
 
< 0.1%
Other values (4) 43
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 13063762
37.2%
a 4424232
 
12.6%
7 4384396
 
12.5%
b 4344528
 
12.4%
4 4343058
 
12.4%
1 4341327
 
12.4%
9 43523
 
0.1%
3 42991
 
0.1%
c 42589
 
0.1%
d 42044
 
0.1%
Other values (6) 16046
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26229951
74.8%
Lowercase Letter 8858545
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 13063762
49.8%
7 4384396
 
16.7%
4 4343058
 
16.6%
1 4341327
 
16.6%
9 43523
 
0.2%
3 42991
 
0.2%
0 3984
 
< 0.1%
6 3652
 
< 0.1%
2 2668
 
< 0.1%
8 590
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 4424232
49.9%
b 4344528
49.0%
c 42589
 
0.5%
d 42044
 
0.5%
e 3324
 
< 0.1%
f 1828
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 26229951
74.8%
Latin 8858545
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 13063762
49.8%
7 4384396
 
16.7%
4 4343058
 
16.6%
1 4341327
 
16.6%
9 43523
 
0.2%
3 42991
 
0.2%
0 3984
 
< 0.1%
6 3652
 
< 0.1%
2 2668
 
< 0.1%
8 590
 
< 0.1%
Latin
ValueCountFrequency (%)
a 4424232
49.9%
b 4344528
49.0%
c 42589
 
0.5%
d 42044
 
0.5%
e 3324
 
< 0.1%
f 1828
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35088496
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 13063762
37.2%
a 4424232
 
12.6%
7 4384396
 
12.5%
b 4344528
 
12.4%
4 4343058
 
12.4%
1 4341327
 
12.4%
9 43523
 
0.1%
3 42991
 
0.1%
c 42589
 
0.1%
d 42044
 
0.1%
Other values (6) 16046
 
< 0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct120
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.491723555
Minimum0
Maximum119
Zeros755928
Zeros (%)17.2%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median3
Q38
95-th percentile18
Maximum119
Range119
Interquartile range (IQR)7

Descriptive statistics

Standard deviation6.988234071
Coefficient of variation (CV)1.272502886
Kurtosis29.00008007
Mean5.491723555
Median Absolute Deviation (MAD)3
Skewness3.8286012
Sum24087040
Variance48.83541543
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 755928
17.2%
1 587413
13.4%
2 476894
10.9%
3 397025
9.1%
4 336111
7.7%
5 286647
 
6.5%
6 242773
 
5.5%
7 205885
 
4.7%
8 173654
 
4.0%
9 145200
 
3.3%
Other values (110) 778532
17.8%
ValueCountFrequency (%)
0 755928
17.2%
1 587413
13.4%
2 476894
10.9%
3 397025
9.1%
4 336111
7.7%
ValueCountFrequency (%)
119 36
< 0.1%
118 36
< 0.1%
117 36
< 0.1%
116 48
< 0.1%
115 36
< 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct36
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean13.46609897
Minimum0
Maximum35
Zeros178326
Zeros (%)4.1%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum0
5-th percentile1
Q16
median12
Q320
95-th percentile32
Maximum35
Range35
Interquartile range (IQR)14

Descriptive statistics

Standard deviation9.366761932
Coefficient of variation (CV)0.6955809511
Kurtosis-0.6928494774
Mean13.46609897
Median Absolute Deviation (MAD)7
Skewness0.4895278852
Sum59063145
Variance87.73622909
MonotonicityNot monotonic
Histogram with fixed size bins (bins=36)
ValueCountFrequency (%)
0 178326
 
4.1%
6 178306
 
4.1%
11 178306
 
4.1%
10 178306
 
4.1%
8 178306
 
4.1%
7 178306
 
4.1%
9 178306
 
4.1%
5 178306
 
4.1%
4 178306
 
4.1%
3 178306
 
4.1%
Other values (26) 2602982
59.3%
ValueCountFrequency (%)
0 178326
4.1%
1 178306
4.1%
2 178306
4.1%
3 178306
4.1%
4 178306
4.1%
ValueCountFrequency (%)
35 55441
1.3%
34 55441
1.3%
33 55441
1.3%
32 55441
1.3%
31 55441
1.3%

pmts_dpd_1073P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct1678
Distinct (%)0.3%
Missing3729399
Missing (%)85.0%
Infinite0
Infinite (%)0.0%
Mean5.19676455
Minimum0
Maximum4155
Zeros624821
Zeros (%)14.2%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum4155
Range4155
Interquartile range (IQR)0

Descriptive statistics

Standard deviation90.26236627
Coefficient of variation (CV)17.36895435
Kurtosis864.1461394
Mean5.19676455
Median Absolute Deviation (MAD)0
Skewness26.9832622
Sum3412523
Variance8147.294764
MonotonicityNot monotonic
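
The mean of pmts_dpd_1073P appears to be taken over non-missing values only, since dividing the reported sum by the number of present rows gives the reported mean. A short check using the figures above (variable names are illustrative):

```python
# Figures copied from the overview and from this variable's statistics.
n_observations = 4_386_062
n_missing = 3_729_399
total = 3_412_523          # reported Sum

n_present = n_observations - n_missing
mean_present = total / n_present        # mean over non-missing rows
mean_all_rows = total / n_observations  # what the mean would be if missing rows counted

print(n_present)
print(f"{mean_present:.8f}")
print(f"{mean_all_rows:.8f}")
```

The first quotient matches the reported mean of 5.19676455; the second (about 0.78) does not, so the 85.0% missing rows are excluded from the statistic.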
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 624821
 
14.2%
1 5855
 
0.1%
2 2542
 
0.1%
3 1905
 
< 0.1%
4 1790
 
< 0.1%
7 1160
 
< 0.1%
5 1084
 
< 0.1%
6 1017
 
< 0.1%
8 1007
 
< 0.1%
10 838
 
< 0.1%
Other values (1668) 14644
 
0.3%
(Missing) 3729399
85.0%
ValueCountFrequency (%)
0 624821
14.2%
1 5855
 
0.1%
2 2542
 
0.1%
3 1905
 
< 0.1%
4 1790
 
< 0.1%
ValueCountFrequency (%)
4155 1
< 0.1%
4136 1
< 0.1%
4110 1
< 0.1%
4079 1
< 0.1%
4053 1
< 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct3930
Distinct (%)0.2%
Missing2445060
Missing (%)55.7%
Infinite0
Infinite (%)0.0%
Mean60.13521573
Minimum-4
Maximum144000
Zeros1622067
Zeros (%)37.0%
Negative257
Negative (%)< 0.1%
Memory size33.5 MiB

Quantile statistics

Minimum-4
5-th percentile0
Q10
median0
Q30
95-th percentile383
Maximum144000
Range144004
Interquartile range (IQR)0

Descriptive statistics

Standard deviation329.9638175
Coefficient of variation (CV)5.48703141
Kurtosis30124.70291
Mean60.13521573
Median Absolute Deviation (MAD)0
Skewness92.41077498
Sum116722574
Variance108876.1209
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1622067
37.0%
1 43495
 
1.0%
3 11351
 
0.3%
2 11135
 
0.3%
4 9405
 
0.2%
6 7649
 
0.2%
7 6425
 
0.1%
5 6305
 
0.1%
9 4578
 
0.1%
8 4463
 
0.1%
Other values (3920) 214129
 
4.9%
(Missing) 2445060
55.7%
ValueCountFrequency (%)
-4 11
 
< 0.1%
-3 25
 
< 0.1%
-2 68
 
< 0.1%
-1 153
 
< 0.1%
0 1622067
37.0%
ValueCountFrequency (%)
144000 1
< 0.1%
84574 1
< 0.1%
84561 2
< 0.1%
84532 1
< 0.1%
84505 1
< 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct12
Distinct (%)< 0.1%
Missing3280058
Missing (%)74.8%
Infinite0
Infinite (%)0.0%
Mean6.5
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum1
5-th percentile1
Q13.75
median6.5
Q39.25
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5.5

Descriptive statistics

Standard deviation3.45205409
Coefficient of variation (CV)0.5310852446
Kurtosis-1.216783293
Mean6.5
Median Absolute Deviation (MAD)3
Skewness0
Sum7189026
Variance11.91667744
MonotonicityNot monotonic
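
The statistics of pmts_month_158T are what a perfectly uniform distribution over the months 1 to 12 would produce; the frequency table for this variable shows each month occurring 92167 times. The sketch below recomputes them under that reading. It assumes the reported variance uses an n - 1 denominator, which is consistent with the reported digits but not stated in the report, and it uses the population form of excess kurtosis, so that value agrees only to about six decimals.

```python
months = range(1, 13)
count_per_month = 92_167        # each month's count in the frequency table
n = 12 * count_per_month        # number of non-missing observations

mean = sum(months) / 12
pop_var = sum((m - mean) ** 2 for m in months) / 12
sample_var = pop_var * n / (n - 1)   # assumed n - 1 denominator

fourth_moment = sum((m - mean) ** 4 for m in months) / 12
excess_kurtosis = fourth_moment / pop_var ** 2 - 3   # population form

print(mean, n * mean)
print(f"{sample_var:.8f} {sample_var ** 0.5:.8f}")
print(f"{excess_kurtosis:.6f}")
```

The mean (6.5), sum (7189026), variance (11.91667744) and standard deviation (3.45205409) match the report, and the symmetry of the distribution explains the skewness of 0.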
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 92167
 
2.1%
3 92167
 
2.1%
4 92167
 
2.1%
5 92167
 
2.1%
6 92167
 
2.1%
7 92167
 
2.1%
8 92167
 
2.1%
9 92167
 
2.1%
10 92167
 
2.1%
11 92167
 
2.1%
Other values (2) 184334
 
4.2%
(Missing) 3280058
74.8%
ValueCountFrequency (%)
1 92167
2.1%
2 92167
2.1%
3 92167
2.1%
4 92167
2.1%
5 92167
2.1%
ValueCountFrequency (%)
12 92167
2.1%
11 92167
2.1%
10 92167
2.1%
9 92167
2.1%
8 92167
2.1%

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct12
Distinct (%)< 0.1%
Missing288770
Missing (%)6.6%
Infinite0
Infinite (%)0.0%
Mean6.5
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum1
5-th percentile1
Q13.75
median6.5
Q39.25
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5.5

Descriptive statistics

Standard deviation3.452052951
Coefficient of variation (CV)0.5310850694
Kurtosis-1.216783237
Mean6.5
Median Absolute Deviation (MAD)3
Skewness0
Sum26632398
Variance11.91666958
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 341441
7.8%
3 341441
7.8%
4 341441
7.8%
5 341441
7.8%
6 341441
7.8%
7 341441
7.8%
8 341441
7.8%
9 341441
7.8%
10 341441
7.8%
11 341441
7.8%
Other values (2) 682882
15.6%
ValueCountFrequency (%)
1 341441
7.8%
2 341441
7.8%
3 341441
7.8%
4 341441
7.8%
5 341441
7.8%
ValueCountFrequency (%)
12 341441
7.8%
11 341441
7.8%
10 341441
7.8%
9 341441
7.8%
8 341441
7.8%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct26436
Distinct (%)4.0%
Missing3725455
Missing (%)84.9%
Infinite0
Infinite (%)0.0%
Mean924.5240739
Minimum0
Maximum8737926
Zeros628168
Zeros (%)14.3%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum8737926
Range8737926
Interquartile range (IQR)0

Descriptive statistics

Standard deviation38970.96235
Coefficient of variation (CV)42.15245817
Kurtosis13629.83893
Mean924.5240739
Median Absolute Deviation (MAD)0
Skewness105.9926051
Sum610747074.9
Variance1518735906
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 628168
 
14.3%
99.8 89
 
< 0.1%
10 52
 
< 0.1%
14 45
 
< 0.1%
400 31
 
< 0.1%
0.2 29
 
< 0.1%
10.400001 27
 
< 0.1%
4 26
 
< 0.1%
10500 24
 
< 0.1%
69126.055 24
 
< 0.1%
Other values (26426) 32092
 
0.7%
(Missing) 3725455
84.9%
ValueCountFrequency (%)
0 628168
14.3%
0.002 1
 
< 0.1%
0.004 6
 
< 0.1%
0.006 2
 
< 0.1%
0.008 4
 
< 0.1%
ValueCountFrequency (%)
8737926 1
< 0.1%
6350786 1
< 0.1%
5523147 1
< 0.1%
5315128 1
< 0.1%
5182516 1
< 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct170033
Distinct (%)8.7%
Missing2442535
Missing (%)55.7%
Infinite0
Infinite (%)0.0%
Mean4111.655686
Minimum0
Maximum17317146
Zeros1612253
Zeros (%)36.8%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile16963.4
Maximum17317146
Range17317146
Interquartile range (IQR)0

Descriptive statistics

Standard deviation64789.7787
Coefficient of variation (CV)15.75758858
Kurtosis45567.63126
Mean4111.655686
Median Absolute Deviation (MAD)0
Skewness186.8551368
Sum7991113840
Variance4197715424
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1612253
36.8%
0.2 894
 
< 0.1%
1000 471
 
< 0.1%
0.4 373
 
< 0.1%
2000 333
 
< 0.1%
0.6 310
 
< 0.1%
3000 297
 
< 0.1%
0.8 294
 
< 0.1%
1.6 286
 
< 0.1%
2 278
 
< 0.1%
Other values (170023) 327738
 
7.5%
(Missing) 2442535
55.7%
ValueCountFrequency (%)
0 1612253
36.8%
0.002 27
 
< 0.1%
0.004 18
 
< 0.1%
0.006 16
 
< 0.1%
0.008 19
 
< 0.1%
ValueCountFrequency (%)
17317146 1
< 0.1%
17230230 1
< 0.1%
17141206 1
< 0.1%
17051322 1
< 0.1%
16963518 1
< 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct6
Distinct (%)< 0.1%
Missing3280058
Missing (%)74.8%
Infinite0
Infinite (%)0.0%
Mean2019.388432
Minimum2016
Maximum2021
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum2016
5-th percentile2018
Q12019
median2020
Q32020
95-th percentile2020
Maximum2021
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.8013438654
Coefficient of variation (CV)0.0003968250253
Kurtosis-0.697380051
Mean2019.388432
Median Absolute Deviation (MAD)1
Skewness-0.3462998149
Sum2233451683
Variance0.6421519906
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Value      Count    Frequency (%)
2020       520011   11.9%
2019       362429   8.3%
2018       179094   4.1%
2021       44413    1.0%
2017       35       < 0.1%
2016       22       < 0.1%
(Missing)  3280058  74.8%
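
Because pmts_year_1139T has only six distinct values, its sum and mean can be recovered exactly from the frequency table. A minimal sketch using the counts above:

```python
# Year -> count, copied from the frequency table (missing rows excluded).
year_counts = {
    2020: 520_011,
    2019: 362_429,
    2018: 179_094,
    2021: 44_413,
    2017: 35,
    2016: 22,
}
n_missing = 3_280_058

n_present = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n_present   # frequency-weighted mean

print(n_present, n_present + n_missing)
print(total)
print(f"{mean:.6f}")
```

The count of present rows plus the missing rows gives the 4386062 observations of the dataset, and the total and mean agree with the reported Sum (2233451683) and Mean (2019.388432).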
Value  Count   Frequency (%)
2016   22      < 0.1%
2017   35      < 0.1%
2018   179094  4.1%
2019   362429  8.3%
2020   520011  11.9%

Value  Count   Frequency (%)
2021   44413   1.0%
2020   520011  11.9%
2019   362429  8.3%
2018   179094  4.1%
2017   35      < 0.1%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct20
Distinct (%)< 0.1%
Missing288770
Missing (%)6.6%
Infinite0
Infinite (%)0.0%
Mean2015.066429
Minimum2002
Maximum2021
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size33.5 MiB

Quantile statistics

Minimum2002
5-th percentile2007
Q12012
median2016
Q32018
95-th percentile2020
Maximum2021
Range19
Interquartile range (IQR)6

Descriptive statistics

Standard deviation4.072742933
Coefficient of variation (CV)0.002021145743
Kurtosis-0.4521212835
Mean2015.066429
Median Absolute Deviation (MAD)3
Skewness-0.76809658
Sum8256315557
Variance16.587235
MonotonicityNot monotonic
Histogram with fixed size bins (bins=20)
ValueCountFrequency (%)
2019 603853
13.8%
2018 583318
13.3%
2017 432657
9.9%
2020 322828
7.4%
2016 316104
7.2%
2015 281667
 
6.4%
2014 263154
 
6.0%
2013 229842
 
5.2%
2012 188788
 
4.3%
2011 163820
 
3.7%
Other values (10) 711261
16.2%
(Missing) 288770
6.6%
ValueCountFrequency (%)
2002 88
 
< 0.1%
2003 316
 
< 0.1%
2004 7827
 
0.2%
2005 39143
 
0.9%
2006 99161
2.3%
ValueCountFrequency (%)
2021 24768
 
0.6%
2020 322828
7.4%
2019 603853
13.8%
2018 583318
13.3%
2017 432657
9.9%

(variable name missing)
Text

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size33.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters35088496
Distinct characters15
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowab3c25cf
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 4210385
96.0%
ab3c25cf 172826
 
3.9%
15f04f45 1431
 
< 0.1%
be4fd70b 917
 
< 0.1%
daf49a8a 490
 
< 0.1%
71ddaa88 9
 
< 0.1%
0c42a10e 2
 
< 0.1%
9ba4314a 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 12806843
36.5%
b 4385047
 
12.5%
a 4384705
 
12.5%
4 4214660
 
12.0%
1 4211829
 
12.0%
7 4211311
 
12.0%
c 345654
 
1.0%
f 177095
 
0.5%
3 172828
 
0.5%
2 172828
 
0.5%
Other values (5) 5696
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25793651
73.5%
Lowercase Letter 9294845
 
26.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 12806843
49.7%
4 4214660
 
16.3%
1 4211829
 
16.3%
7 4211311
 
16.3%
3 172828
 
0.7%
2 172828
 
0.7%
0 2352
 
< 0.1%
8 508
 
< 0.1%
9 492
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 4385047
47.2%
a 4384705
47.2%
c 345654
 
3.7%
f 177095
 
1.9%
d 1425
 
< 0.1%
e 919
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 25793651
73.5%
Latin 9294845
 
26.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 12806843
49.7%
4 4214660
 
16.3%
1 4211829
 
16.3%
7 4211311
 
16.3%
3 172828
 
0.7%
2 172828
 
0.7%
0 2352
 
< 0.1%
8 508
 
< 0.1%
9 492
 
< 0.1%
Latin
ValueCountFrequency (%)
b 4385047
47.2%
a 4384705
47.2%
c 345654
 
3.7%
f 177095
 
1.9%
d 1425
 
< 0.1%
e 919
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35088496
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 12806843
36.5%
b 4385047
 
12.5%
a 4384705
 
12.5%
4 4214660
 
12.0%
1 4211829
 
12.0%
7 4211311
 
12.0%
c 345654
 
1.0%
f 177095
 
0.5%
3 172828
 
0.5%
2 172828
 
0.5%
Other values (5) 5696
 
< 0.1%

(variable name missing)
Text

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size33.5 MiB

Length

Max length9
Median length8
Mean length8.00001596
Min length8

Characters and Unicode

Total characters35088566
Distinct characters17
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowa55475b1
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 4341390
99.0%
ab3c25cf 43626
 
1.0%
be4fd70b 408
 
< 0.1%
daf49a8a 299
 
< 0.1%
15f04f45 269
 
< 0.1%
p28_48_88 70
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 13068334
37.2%
a 4385913
 
12.5%
b 4385832
 
12.5%
4 4342705
 
12.4%
7 4341798
 
12.4%
1 4341659
 
12.4%
c 87252
 
0.2%
f 44871
 
0.1%
2 43696
 
0.1%
3 43626
 
0.1%
Other values (7) 2880
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26183373
74.6%
Lowercase Letter 8904983
 
25.4%
Connector Punctuation 140
 
< 0.1%
Uppercase Letter 70
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 13068334
49.9%
4 4342705
 
16.6%
7 4341798
 
16.6%
1 4341659
 
16.6%
2 43696
 
0.2%
3 43626
 
0.2%
0 677
 
< 0.1%
8 579
 
< 0.1%
9 299
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 4385913
49.3%
b 4385832
49.3%
c 87252
 
1.0%
f 44871
 
0.5%
d 707
 
< 0.1%
e 408
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 140
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 70
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 26183513
74.6%
Latin 8905053
 
25.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 13068334
49.9%
4 4342705
 
16.6%
7 4341798
 
16.6%
1 4341659
 
16.6%
2 43696
 
0.2%
3 43626
 
0.2%
0 677
 
< 0.1%
8 579
 
< 0.1%
9 299
 
< 0.1%
_ 140
 
< 0.1%
Latin
ValueCountFrequency (%)
a 4385913
49.3%
b 4385832
49.3%
c 87252
 
1.0%
f 44871
 
0.5%
d 707
 
< 0.1%
e 408
 
< 0.1%
P 70
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35088566
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 13068334
37.2%
a 4385913
 
12.5%
b 4385832
 
12.5%
4 4342705
 
12.4%
7 4341798
 
12.4%
1 4341659
 
12.4%
c 87252
 
0.2%
f 44871
 
0.1%
2 43696
 
0.1%
3 43626
 
0.1%
Other values (7) 2880
 
< 0.1%