Overview

Dataset statistics

Number of variables: 19
Number of observations: 13927071
Missing cells: 88764540
Missing cells (%): 33.5%
Total size in memory: 2.0 GiB
Average record size in memory: 152.0 B
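
The derived figures in this table can be checked from the raw counts. The sketch below is a plain-arithmetic cross-check, not part of the profiling tool; the variable names are illustrative, and it assumes "Missing cells (%)" is computed over all cells (variables × observations).

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 13_927_071
missing_cells = 88_764_540
avg_record_bytes = 152.0

# Assumption: the missing percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate memory footprint implied by the average record size.
approx_size_gib = avg_record_bytes * n_observations / 2**30

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approximate size: {approx_size_gib:.1f} GiB")
```

Both derived values agree with the reported 33.5% and 2.0 GiB after rounding.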

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 13778078 (98.9%) missing values [Missing]
collater_valueofguarantee_876L has 13373863 (96.0%) missing values [Missing]
pmts_dpd_1073P has 11883298 (85.3%) missing values [Missing]
pmts_dpd_303P has 7816731 (56.1%) missing values [Missing]
pmts_month_158T has 10075311 (72.3%) missing values [Missing]
pmts_month_706T has 1040751 (7.5%) missing values [Missing]
pmts_overdue_1140A has 11871027 (85.2%) missing values [Missing]
pmts_overdue_1152A has 7809419 (56.1%) missing values [Missing]
pmts_year_1139T has 10075311 (72.3%) missing values [Missing]
pmts_year_507T has 1040751 (7.5%) missing values [Missing]
collater_valueofguarantee_1124L is highly skewed (γ1 = 47.0809369) [Skewed]
collater_valueofguarantee_876L is highly skewed (γ1 = 68.99623008) [Skewed]
pmts_dpd_1073P is highly skewed (γ1 = 36.69774178) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 137.8117801) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 314.5818664) [Skewed]
collater_valueofguarantee_1124L has 139355 (1.0%) zeros [Zeros]
collater_valueofguarantee_876L has 504614 (3.6%) zeros [Zeros]
num_group1 has 2482094 (17.8%) zeros [Zeros]
num_group2 has 560081 (4.0%) zeros [Zeros]
pmts_dpd_1073P has 1958853 (14.1%) zeros [Zeros]
pmts_dpd_303P has 5124725 (36.8%) zeros [Zeros]
pmts_overdue_1140A has 1969331 (14.1%) zeros [Zeros]
pmts_overdue_1152A has 5090863 (36.6%) zeros [Zeros]
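
The per-variable missing counts in the alerts can be reconciled with the dataset-level "Missing cells" figure. The following is a small cross-check with the counts copied from the alerts; the dictionary name is illustrative.

```python
# Missing-value counts copied from the "Missing" alerts.
missing_by_variable = {
    "collater_valueofguarantee_1124L": 13_778_078,
    "collater_valueofguarantee_876L": 13_373_863,
    "pmts_dpd_1073P": 11_883_298,
    "pmts_dpd_303P": 7_816_731,
    "pmts_month_158T": 10_075_311,
    "pmts_month_706T": 1_040_751,
    "pmts_overdue_1140A": 11_871_027,
    "pmts_overdue_1152A": 7_809_419,
    "pmts_year_1139T": 10_075_311,
    "pmts_year_507T": 1_040_751,
}
n_observations = 13_927_071

total_missing = sum(missing_by_variable.values())
print(f"total missing cells: {total_missing}")

# Per-variable missing share, rounded the way the alerts report it.
for name, count in missing_by_variable.items():
    print(f"{name}: {100 * count / n_observations:.1f}%")
```

The ten counts add up to exactly 88764540, the reported "Missing cells" total, so the remaining nine variables contribute no missing values.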

Reproduction

Analysis started: 2024-02-13 19:51:41.397929
Analysis finished: 2024-02-13 19:52:06.212156
Duration: 24.81 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
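
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-02-13 19:51:41.397929")
finished = datetime.fromisoformat("2024-02-13 19:52:06.212156")

duration_s = (finished - started).total_seconds()
print(f"duration: {duration_s:.2f} seconds")
```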

Variables

case_id
Real number (ℝ)

Distinct: 77457
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1438578.907
Minimum: 51083
Maximum: 2688744
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 51083
5-th percentile: 226210
Q1: 238681
median: 1852943
Q3: 1870815
95-th percentile: 2685193
Maximum: 2688744
Range: 2637661
Interquartile range (IQR): 1632134

Descriptive statistics

Standard deviation: 806896.0686
Coefficient of variation (CV): 0.5608980255
Kurtosis: -1.045670522
Mean: 1438578.907
Median Absolute Deviation (MAD): 25736
Skewness: -0.4369387253
Sum: 2.003519058 × 10^13
Variance: 6.510812655 × 10^11
Monotonicity: Increasing
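
Several of the case_id statistics are derived from others in the same tables. The sketch below recomputes them from the reported values, assuming the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the case_id statistics.
minimum, maximum = 51083, 2688744
q1, q3 = 238681, 1870815
mean = 1438578.907
std = 806896.0686

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr)
print(f"CV = {cv:.7f}")
print(f"variance = {variance:.6e}")
```

The results agree with the reported Range, IQR, CV, and Variance up to the rounding of the inputs.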
[Figure omitted: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
1863457 | 2772 | < 0.1%
1856049 | 2388 | < 0.1%
1845383 | 2076 | < 0.1%
236981 | 1944 | < 0.1%
225313 | 1872 | < 0.1%
52198 | 1668 | < 0.1%
1845023 | 1668 | < 0.1%
1849279 | 1608 | < 0.1%
1846434 | 1584 | < 0.1%
237301 | 1584 | < 0.1%
Other values (77447) | 13907907 | 99.9%

Value | Count | Frequency (%)
51083 | 96 | < 0.1%
51099 | 120 | < 0.1%
51103 | 84 | < 0.1%
51106 | 60 | < 0.1%
51115 | 240 | < 0.1%

Value | Count | Frequency (%)
2688744 | 24 | < 0.1%
2688743 | 156 | < 0.1%
2688742 | 96 | < 0.1%
2688741 | 96 | < 0.1%
2688740 | 48 | < 0.1%
Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 106.3 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 111416568
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9a0c095e
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 13778078 | 98.9%
9a0c095e | 99894 | 0.7%
8fd95e4b | 49026 | 0.4%
06fb9ba8 | 73 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
5 | 41483154 | 37.2%
a | 13878045 | 12.5%
b | 13827250 | 12.4%
4 | 13827104 | 12.4%
7 | 13778078 | 12.4%
1 | 13778078 | 12.4%
9 | 248887 | 0.2%
0 | 199861 | 0.2%
e | 148920 | 0.1%
c | 99894 | 0.1%
Other values (4) | 147297 | 0.1%
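
The character counts above can be derived from the value frequency table for this variable: each value contributes its characters once per occurrence. The sketch below reproduces that aggregation with the four values and counts copied from the report; it illustrates the arithmetic and is not the profiling tool's own code.

```python
from collections import Counter

# Value frequencies copied from the report for this text variable.
value_counts = {
    "a55475b1": 13_778_078,
    "9a0c095e": 99_894,
    "8fd95e4b": 49_026,
    "06fb9ba8": 73,
}

# Weight every character of a value by the number of rows holding that value.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
print(char_counts.most_common(3))
print(f"total characters: {total_chars}, distinct: {len(char_counts)}")
```

The totals for "5", "a", and "b" match the table, as do the reported 111416568 total characters and 14 distinct characters.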

Most occurring categories

ValueCountFrequency (%)
Decimal Number 83364334
74.8%
Lowercase Letter 28052234
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 41483154
49.8%
4 13827104
 
16.6%
7 13778078
 
16.5%
1 13778078
 
16.5%
9 248887
 
0.3%
0 199861
 
0.2%
8 49099
 
0.1%
6 73
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 13878045
49.5%
b 13827250
49.3%
e 148920
 
0.5%
c 99894
 
0.4%
f 49099
 
0.2%
d 49026
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 83364334
74.8%
Latin 28052234
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 41483154
49.8%
4 13827104
 
16.6%
7 13778078
 
16.5%
1 13778078
 
16.5%
9 248887
 
0.3%
0 199861
 
0.2%
8 49099
 
0.1%
6 73
 
< 0.1%
Latin
ValueCountFrequency (%)
a 13878045
49.5%
b 13827250
49.3%
e 148920
 
0.5%
c 99894
 
0.4%
f 49099
 
0.2%
d 49026
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 111416568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 41483154
37.2%
a 13878045
 
12.5%
b 13827250
 
12.4%
4 13827104
 
12.4%
7 13778078
 
12.4%
1 13778078
 
12.4%
9 248887
 
0.2%
0 199861
 
0.2%
e 148920
 
0.1%
c 99894
 
0.1%
Other values (4) 147297
 
0.1%
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 106.3 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 111416568
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 9a0c095e
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 13373863 | 96.0%
9a0c095e | 299054 | 2.1%
8fd95e4b | 253478 | 1.8%
06fb9ba8 | 568 | < 0.1%
3cbe86ba | 105 | < 0.1%
c7a5ad39 | 2 | < 0.1%
9276e4bb | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
5 | 40674123 | 36.5%
a | 13673594 | 12.3%
b | 13628689 | 12.2%
4 | 13627342 | 12.2%
7 | 13373866 | 12.0%
1 | 13373863 | 12.0%
9 | 852157 | 0.8%
0 | 598676 | 0.5%
e | 552638 | 0.5%
c | 299161 | 0.3%
Other values (6) | 762459 | 0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 82754960
74.3%
Lowercase Letter 28661608
 
25.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 40674123
49.2%
4 13627342
 
16.5%
7 13373866
 
16.2%
1 13373863
 
16.2%
9 852157
 
1.0%
0 598676
 
0.7%
8 254151
 
0.3%
6 674
 
< 0.1%
3 107
 
< 0.1%
2 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 13673594
47.7%
b 13628689
47.6%
e 552638
 
1.9%
c 299161
 
1.0%
f 254046
 
0.9%
d 253480
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Common 82754960
74.3%
Latin 28661608
 
25.7%

Most frequent character per script

Common
ValueCountFrequency (%)
5 40674123
49.2%
4 13627342
 
16.5%
7 13373866
 
16.2%
1 13373863
 
16.2%
9 852157
 
1.0%
0 598676
 
0.7%
8 254151
 
0.3%
6 674
 
< 0.1%
3 107
 
< 0.1%
2 1
 
< 0.1%
Latin
ValueCountFrequency (%)
a 13673594
47.7%
b 13628689
47.6%
e 552638
 
1.9%
c 299161
 
1.0%
f 254046
 
0.9%
d 253480
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 111416568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 40674123
36.5%
a 13673594
 
12.3%
b 13628689
 
12.2%
4 13627342
 
12.2%
7 13373866
 
12.0%
1 13373863
 
12.0%
9 852157
 
0.8%
0 598676
 
0.5%
e 552638
 
0.5%
c 299161
 
0.3%
Other values (6) 762459
 
0.7%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 6658
Distinct (%): 4.5%
Missing: 13778078
Missing (%): 98.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1917304.433
Minimum: 0
Maximum: 3200000000
Zeros: 139355
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2774896.824
Maximum: 3200000000
Range: 3200000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 36878205.82
Coefficient of variation (CV): 19.23440283
Kurtosis: 3065.259628
Mean: 1917304.433
Median Absolute Deviation (MAD): 0
Skewness: 47.0809369
Sum: 2.856649394 × 10^11
Variance: 1.360002064 × 10^15
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 139355 | 1.0%
5000000 | 73 | < 0.1%
200000000 | 68 | < 0.1%
400000000 | 60 | < 0.1%
1200000000 | 48 | < 0.1%
3000000 | 42 | < 0.1%
4000000 | 40 | < 0.1%
10000000 | 39 | < 0.1%
7000000 | 36 | < 0.1%
6000000 | 35 | < 0.1%
Other values (6648) | 9197 | 0.1%
(Missing) | 13778078 | 98.9%

Value | Count | Frequency (%)
0 | 139355 | 1.0%
1 | 21 | < 0.1%
383 | 1 | < 0.1%
1484 | 1 | < 0.1%
1866 | 1 | < 0.1%

Value | Count | Frequency (%)
3200000000 | 3 | < 0.1%
3000000000 | 5 | < 0.1%
1200000000 | 48 | < 0.1%
1139827000 | 6 | < 0.1%
983057000 | 6 | < 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 17093
Distinct (%): 3.1%
Missing: 13373863
Missing (%): 96.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1960659.196
Minimum: 0
Maximum: 6804986362
Zeros: 504614
Zeros (%): 3.6%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 118744.55
Maximum: 6804986362
Range: 6804986362
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 69301291.8
Coefficient of variation (CV): 35.34591425
Kurtosis: 5916.584668
Mean: 1960659.196
Median Absolute Deviation (MAD): 0
Skewness: 68.99623008
Sum: 1.084652352 × 10^12
Variance: 4.802669045 × 10^15
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 504614 | 3.6%
60000 | 1427 | < 0.1%
130000 | 1061 | < 0.1%
100000 | 1053 | < 0.1%
50000 | 762 | < 0.1%
65000 | 640 | < 0.1%
300000 | 511 | < 0.1%
70000 | 507 | < 0.1%
80000 | 500 | < 0.1%
150000 | 494 | < 0.1%
Other values (17083) | 41639 | 0.3%
(Missing) | 13373863 | 96.0%

Value | Count | Frequency (%)
0 | 504614 | 3.6%
0.14 | 1 | < 0.1%
0.99 | 1 | < 0.1%
1 | 159 | < 0.1%
1.2 | 1 | < 0.1%

Value | Count | Frequency (%)
6804986362 | 32 | < 0.1%
3250000000 | 39 | < 0.1%
3200000000 | 8 | < 0.1%
2000000000 | 101 | < 0.1%
1200000000 | 46 | < 0.1%
Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 106.3 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 111416568
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c7a5ad39
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 13373863 | 96.0%
c7a5ad39 | 437459 | 3.1%
3cbe86ba | 80058 | 0.6%
9276e4bb | 10883 | 0.1%
0e63c0f0 | 8992 | 0.1%
168ad9f3 | 3545 | < 0.1%
5224034a | 2668 | < 0.1%
940efad7 | 2520 | < 0.1%
7b62420e | 2338 | < 0.1%
2fd21cf1 | 1539 | < 0.1%
Other values (5) | 3206 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
5 | 40563772 | 36.4%
a | 14340330 | 12.9%
7 | 13829035 | 12.4%
b | 13559262 | 12.2%
4 | 13398592 | 12.0%
1 | 13381291 | 12.0%
3 | 533973 | 0.5%
c | 530104 | 0.5%
9 | 456909 | 0.4%
d | 445839 | 0.4%
Other values (6) | 377461 | 0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 82416509
74.0%
Lowercase Letter 29000059
 
26.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 40563772
49.2%
7 13829035
 
16.8%
4 13398592
 
16.3%
1 13381291
 
16.2%
3 533973
 
0.6%
9 456909
 
0.6%
6 106995
 
0.1%
8 85184
 
0.1%
0 36009
 
< 0.1%
2 24749
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 14340330
49.4%
b 13559262
46.8%
c 530104
 
1.8%
d 445839
 
1.5%
e 105596
 
0.4%
f 18928
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 82416509
74.0%
Latin 29000059
 
26.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 40563772
49.2%
7 13829035
 
16.8%
4 13398592
 
16.3%
1 13381291
 
16.2%
3 533973
 
0.6%
9 456909
 
0.6%
6 106995
 
0.1%
8 85184
 
0.1%
0 36009
 
< 0.1%
2 24749
 
< 0.1%
Latin
ValueCountFrequency (%)
a 14340330
49.4%
b 13559262
46.8%
c 530104
 
1.8%
d 445839
 
1.5%
e 105596
 
0.4%
f 18928
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 111416568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 40563772
36.4%
a 14340330
 
12.9%
7 13829035
 
12.4%
b 13559262
 
12.2%
4 13398592
 
12.0%
1 13381291
 
12.0%
3 533973
 
0.5%
c 530104
 
0.5%
9 456909
 
0.4%
d 445839
 
0.4%
Other values (6) 377461
 
0.3%
Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 106.3 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 111416568
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: c7a5ad39
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 13778078 | 98.9%
c7a5ad39 | 136255 | 1.0%
9276e4bb | 5003 | < 0.1%
0e63c0f0 | 3727 | < 0.1%
7b62420e | 1393 | < 0.1%
168ad9f3 | 1379 | < 0.1%
940efad7 | 360 | < 0.1%
f4d8a027 | 324 | < 0.1%
2fd21cf1 | 193 | < 0.1%
3cbe86ba | 166 | < 0.1%
Other values (4) | 193 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
5 | 41470681 | 37.2%
a | 14052918 | 12.6%
7 | 13921507 | 12.5%
b | 13789902 | 12.4%
4 | 13785359 | 12.4%
1 | 13779935 | 12.4%
9 | 143067 | 0.1%
3 | 141627 | 0.1%
c | 140468 | 0.1%
d | 138511 | 0.1%
Other values (6) | 52593 | < 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 83277851
74.7%
Lowercase Letter 28138717
 
25.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 41470681
49.8%
7 13921507
 
16.7%
4 13785359
 
16.6%
1 13779935
 
16.5%
9 143067
 
0.2%
3 141627
 
0.2%
0 13324
 
< 0.1%
6 11761
 
< 0.1%
2 8629
 
< 0.1%
8 1961
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 14052918
49.9%
b 13789902
49.0%
c 140468
 
0.5%
d 138511
 
0.5%
e 10741
 
< 0.1%
f 6177
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 83277851
74.7%
Latin 28138717
 
25.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 41470681
49.8%
7 13921507
 
16.7%
4 13785359
 
16.6%
1 13779935
 
16.5%
9 143067
 
0.2%
3 141627
 
0.2%
0 13324
 
< 0.1%
6 11761
 
< 0.1%
2 8629
 
< 0.1%
8 1961
 
< 0.1%
Latin
ValueCountFrequency (%)
a 14052918
49.9%
b 13789902
49.0%
c 140468
 
0.5%
d 138511
 
0.5%
e 10741
 
< 0.1%
f 6177
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 111416568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 41470681
37.2%
a 14052918
 
12.6%
7 13921507
 
12.5%
b 13789902
 
12.4%
4 13785359
 
12.4%
1 13779935
 
12.4%
9 143067
 
0.1%
3 141627
 
0.1%
c 140468
 
0.1%
d 138511
 
0.1%
Other values (6) 52593
 
< 0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 198
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.265338204
Minimum: 0
Maximum: 197
Zeros: 2482094
Zeros (%): 17.8%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 7
95-th percentile: 17
Maximum: 197
Range: 197
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 6.875070202
Coefficient of variation (CV): 1.305722431
Kurtosis: 57.95678216
Mean: 5.265338204
Median Absolute Deviation (MAD): 3
Skewness: 4.909283745
Sum: 73330739
Variance: 47.26659028
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 2482094 | 17.8%
1 | 1935612 | 13.9%
2 | 1542782 | 11.1%
3 | 1270423 | 9.1%
4 | 1064545 | 7.6%
5 | 898181 | 6.4%
6 | 761863 | 5.5%
7 | 641975 | 4.6%
8 | 539828 | 3.9%
9 | 453673 | 3.3%
Other values (188) | 2336095 | 16.8%

Value | Count | Frequency (%)
0 | 2482094 | 17.8%
1 | 1935612 | 13.9%
2 | 1542782 | 11.1%
3 | 1270423 | 9.1%
4 | 1064545 | 7.6%

Value | Count | Frequency (%)
197 | 12 | < 0.1%
196 | 12 | < 0.1%
195 | 12 | < 0.1%
194 | 12 | < 0.1%
193 | 12 | < 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct: 40
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.55590016
Minimum: 0
Maximum: 39
Zeros: 560081
Zeros (%): 4.0%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
median: 12
Q3: 20
95-th percentile: 32
Maximum: 39
Range: 39
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 9.381094055
Coefficient of variation (CV): 0.6920303295
Kurtosis: -0.7071473224
Mean: 13.55590016
Median Absolute Deviation (MAD): 7
Skewness: 0.4768722509
Sum: 188793984
Variance: 88.00492566
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=40)]

Value | Count | Frequency (%)
0 | 560081 | 4.0%
1 | 560031 | 4.0%
2 | 560030 | 4.0%
7 | 560030 | 4.0%
3 | 560030 | 4.0%
9 | 560030 | 4.0%
8 | 560030 | 4.0%
11 | 560030 | 4.0%
6 | 560030 | 4.0%
5 | 560030 | 4.0%
Other values (30) | 8326719 | 59.8%

Value | Count | Frequency (%)
0 | 560081 | 4.0%
1 | 560031 | 4.0%
2 | 560030 | 4.0%
3 | 560030 | 4.0%
4 | 560030 | 4.0%

Value | Count | Frequency (%)
39 | 1 | < 0.1%
38 | 1 | < 0.1%
37 | 1 | < 0.1%
36 | 1 | < 0.1%
35 | 178580 | 1.3%

pmts_dpd_1073P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 2146
Distinct (%): 0.1%
Missing: 11883298
Missing (%): 85.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.107236469
Minimum: 0
Maximum: 4520
Zeros: 1958853
Zeros (%): 14.1%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 4520
Range: 4520
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 70.10703701
Coefficient of variation (CV): 22.56250456
Kurtosis: 1614.321289
Mean: 3.107236469
Median Absolute Deviation (MAD): 0
Skewness: 36.69774178
Sum: 6350486
Variance: 4914.996638
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 1958853 | 14.1%
1 | 17075 | 0.1%
2 | 6974 | 0.1%
3 | 6007 | < 0.1%
4 | 5304 | < 0.1%
7 | 3365 | < 0.1%
5 | 3236 | < 0.1%
6 | 2993 | < 0.1%
8 | 2758 | < 0.1%
10 | 2533 | < 0.1%
Other values (2136) | 34675 | 0.2%
(Missing) | 11883298 | 85.3%

Value | Count | Frequency (%)
0 | 1958853 | 14.1%
1 | 17075 | 0.1%
2 | 6974 | 0.1%
3 | 6007 | < 0.1%
4 | 5304 | < 0.1%

Value | Count | Frequency (%)
4520 | 1 | < 0.1%
4497 | 1 | < 0.1%
4460 | 1 | < 0.1%
4446 | 1 | < 0.1%
4415 | 1 | < 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 4145
Distinct (%): 0.1%
Missing: 7816731
Missing (%): 56.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 58.6952402
Minimum: -9
Maximum: 84575
Zeros: 5124725
Zeros (%): 36.8%
Negative: 779
Negative (%): < 0.1%
Memory size: 106.3 MiB

Quantile statistics

Minimum: -9
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 378
Maximum: 84575
Range: 84584
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 279.0666744
Coefficient of variation (CV): 4.754502639
Kurtosis: 2883.58435
Mean: 58.6952402
Median Absolute Deviation (MAD): 0
Skewness: 16.20937066
Sum: 358647874
Variance: 77878.20878
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 5124725 | 36.8%
1 | 137651 | 1.0%
3 | 35474 | 0.3%
2 | 33307 | 0.2%
4 | 29262 | 0.2%
6 | 24181 | 0.2%
7 | 19331 | 0.1%
5 | 19320 | 0.1%
9 | 14088 | 0.1%
8 | 13653 | 0.1%
Other values (4135) | 659348 | 4.7%
(Missing) | 7816731 | 56.1%

Value | Count | Frequency (%)
-9 | 6 | < 0.1%
-8 | 7 | < 0.1%
-7 | 2 | < 0.1%
-6 | 12 | < 0.1%
-5 | 3 | < 0.1%

Value | Count | Frequency (%)
84575 | 1 | < 0.1%
84573 | 1 | < 0.1%
29250 | 4 | < 0.1%
5184 | 1 | < 0.1%
5164 | 1 | < 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 10075311
Missing (%): 72.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052978
Coefficient of variation (CV): 0.5310850735
Kurtosis: -1.216783239
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 25036440
Variance: 11.91666976
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=12)]

Value | Count | Frequency (%)
2 | 320980 | 2.3%
3 | 320980 | 2.3%
4 | 320980 | 2.3%
5 | 320980 | 2.3%
6 | 320980 | 2.3%
7 | 320980 | 2.3%
8 | 320980 | 2.3%
9 | 320980 | 2.3%
10 | 320980 | 2.3%
11 | 320980 | 2.3%
Other values (2) | 641960 | 4.6%
(Missing) | 10075311 | 72.3%

Value | Count | Frequency (%)
1 | 320980 | 2.3%
2 | 320980 | 2.3%
3 | 320980 | 2.3%
4 | 320980 | 2.3%
5 | 320980 | 2.3%

Value | Count | Frequency (%)
12 | 320980 | 2.3%
11 | 320980 | 2.3%
10 | 320980 | 2.3%
9 | 320980 | 2.3%
8 | 320980 | 2.3%
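
Because pmts_month_158T takes each of the values 1 to 12 exactly 320980 times among its non-missing rows, its summary statistics can be rebuilt from the frequency table alone. The sketch below does this with plain arithmetic. It assumes the reported variance uses the sample (n − 1) denominator, which is what the reported digits appear consistent with; the names are illustrative.

```python
# Frequency table for pmts_month_158T: months 1..12, 320980 rows each
# (figures copied from the report; missing rows are excluded).
counts = {month: 320_980 for month in range(1, 13)}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sum of squared deviations, weighted by how often each value occurs.
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
sample_variance = squared_dev / (n - 1)  # assumption: n - 1 denominator
sample_std = sample_variance ** 0.5

print(n, total, mean)
print(round(sample_variance, 8), round(sample_std, 9))
```

The non-missing row count (3851760) equals the number of observations minus the reported missing count, and the sum, mean, variance, and standard deviation agree with the reported values.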

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 1040751
Missing (%): 7.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052663
Coefficient of variation (CV): 0.5310850252
Kurtosis: -1.216783223
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 83761080
Variance: 11.91666759
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=12)]

Value | Count | Frequency (%)
2 | 1073860 | 7.7%
3 | 1073860 | 7.7%
4 | 1073860 | 7.7%
5 | 1073860 | 7.7%
6 | 1073860 | 7.7%
7 | 1073860 | 7.7%
8 | 1073860 | 7.7%
9 | 1073860 | 7.7%
10 | 1073860 | 7.7%
11 | 1073860 | 7.7%
Other values (2) | 2147720 | 15.4%

Value | Count | Frequency (%)
1 | 1073860 | 7.7%
2 | 1073860 | 7.7%
3 | 1073860 | 7.7%
4 | 1073860 | 7.7%
5 | 1073860 | 7.7%

Value | Count | Frequency (%)
12 | 1073860 | 7.7%
11 | 1073860 | 7.7%
10 | 1073860 | 7.7%
9 | 1073860 | 7.7%
8 | 1073860 | 7.7%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 69983
Distinct (%): 3.4%
Missing: 11871027
Missing (%): 85.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 599.1327471
Minimum: 0
Maximum: 5881917.5
Zeros: 1969331
Zeros (%): 14.1%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 5881917.5
Range: 5881917.5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 30430.77891
Coefficient of variation (CV): 50.79137981
Kurtosis: 22225.41324
Mean: 599.1327471
Median Absolute Deviation (MAD): 0
Skewness: 137.8117801
Sum: 1231843290
Variance: 926032305.1
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 1969331 | 14.1%
10 | 207 | < 0.1%
14 | 111 | < 0.1%
400 | 96 | < 0.1%
99.8 | 94 | < 0.1%
0.2 | 89 | < 0.1%
4 | 80 | < 0.1%
0.4 | 80 | < 0.1%
1.6 | 71 | < 0.1%
1.2 | 66 | < 0.1%
Other values (69973) | 85819 | 0.6%
(Missing) | 11871027 | 85.2%

Value | Count | Frequency (%)
0 | 1969331 | 14.1%
0.002 | 10 | < 0.1%
0.004 | 6 | < 0.1%
0.006 | 4 | < 0.1%
0.008 | 9 | < 0.1%

Value | Count | Frequency (%)
5881917.5 | 17 | < 0.1%
5881667.5 | 1 | < 0.1%
5881552.5 | 1 | < 0.1%
5881277.5 | 3 | < 0.1%
5878422 | 2 | < 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 448986
Distinct (%): 7.3%
Missing: 7809419
Missing (%): 56.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3954.427561
Minimum: 0
Maximum: 51793576
Zeros: 5090863
Zeros (%): 36.6%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 16063.50675
Maximum: 51793576
Range: 51793576
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 87539.96896
Coefficient of variation (CV): 22.13720383
Kurtosis: 142902.378
Mean: 3954.427561
Median Absolute Deviation (MAD): 0
Skewness: 314.5818664
Sum: 2.419181168 × 10^10
Variance: 7663246166
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 5090863 | 36.6%
0.2 | 2936 | < 0.1%
1000 | 1581 | < 0.1%
0.4 | 1453 | < 0.1%
0.8 | 1110 | < 0.1%
2000 | 1086 | < 0.1%
2 | 1025 | < 0.1%
1.6 | 1005 | < 0.1%
0.6 | 930 | < 0.1%
1 | 927 | < 0.1%
Other values (448976) | 1014736 | 7.3%
(Missing) | 7809419 | 56.1%

Value | Count | Frequency (%)
0 | 5090863 | 36.6%
0.002 | 106 | < 0.1%
0.004 | 36 | < 0.1%
0.006 | 56 | < 0.1%
0.008 | 40 | < 0.1%

Value | Count | Frequency (%)
51793576 | 1 | < 0.1%
51169692 | 1 | < 0.1%
49788236 | 1 | < 0.1%
48674156 | 1 | < 0.1%
47470956 | 1 | < 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): < 0.1%
Missing: 10075311
Missing (%): 72.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.364831
Minimum: 2016
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 2016
5-th percentile: 2018
Q1: 2019
median: 2019
Q3: 2020
95-th percentile: 2020
Maximum: 2021
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7904653126
Coefficient of variation (CV): 0.000391442547
Kurtosis: -0.6875733162
Mean: 2019.364831
Median Absolute Deviation (MAD): 1
Skewness: -0.2852906722
Sum: 7778108680
Variance: 0.6248354104
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=6)]

Value | Count | Frequency (%)
2020 | 1724349 | 12.4%
2019 | 1370885 | 9.8%
2018 | 610446 | 4.4%
2021 | 145888 | 1.0%
2017 | 137 | < 0.1%
2016 | 55 | < 0.1%
(Missing) | 10075311 | 72.3%

Value | Count | Frequency (%)
2016 | 55 | < 0.1%
2017 | 137 | < 0.1%
2018 | 610446 | 4.4%
2019 | 1370885 | 9.8%
2020 | 1724349 | 12.4%

Value | Count | Frequency (%)
2021 | 145888 | 1.0%
2020 | 1724349 | 12.4%
2019 | 1370885 | 9.8%
2018 | 610446 | 4.4%
2017 | 137 | < 0.1%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct: 21
Distinct (%): < 0.1%
Missing: 1040751
Missing (%): 7.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.879039
Minimum: 2001
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 106.3 MiB

Quantile statistics

Minimum: 2001
5-th percentile: 2007
Q1: 2012
median: 2016
Q3: 2018
95-th percentile: 2020
Maximum: 2021
Range: 20
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.02004628
Coefficient of variation (CV): 0.001995179959
Kurtosis: -0.4722344126
Mean: 2014.879039
Median Absolute Deviation (MAD): 3
Skewness: -0.7492984691
Sum: 2.596437606 × 10^10
Variance: 16.16077209
Monotonicity: Not monotonic
[Figure omitted: histogram with fixed size bins (bins=21)]

Value | Count | Frequency (%)
2018 | 1882027 | 13.5%
2019 | 1801228 | 12.9%
2017 | 1445462 | 10.4%
2016 | 1061050 | 7.6%
2015 | 945567 | 6.8%
2014 | 869514 | 6.2%
2013 | 751105 | 5.4%
2020 | 707133 | 5.1%
2012 | 614324 | 4.4%
2011 | 531763 | 3.8%
Other values (11) | 2277147 | 16.4%
(Missing) | 1040751 | 7.5%

Value | Count | Frequency (%)
2001 | 44 | < 0.1%
2002 | 257 | < 0.1%
2003 | 760 | < 0.1%
2004 | 25543 | 0.2%
2005 | 126528 | 0.9%

Value | Count | Frequency (%)
2021 | 50720 | 0.4%
2020 | 707133 | 5.1%
2019 | 1801228 | 12.9%
2018 | 1882027 | 13.5%
2017 | 1445462 | 10.4%
Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 106.3 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 111416568
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 13377486 | 96.1%
ab3c25cf | 541038 | 3.9%
15f04f45 | 4328 | < 0.1%
be4fd70b | 2821 | < 0.1%
daf49a8a | 1355 | < 0.1%
0c42a10e | 20 | < 0.1%
71ddaa88 | 14 | < 0.1%
1d94eac1 | 6 | < 0.1%
652d52e3 | 2 | < 0.1%
9ba4314a | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
5 | 40682156 | 36.5%
b | 13924167 | 12.5%
a | 13922645 | 12.5%
4 | 13390346 | 12.0%
1 | 13381861 | 12.0%
7 | 13380321 | 12.0%
c | 1082102 | 1.0%
f | 553870 | 0.5%
2 | 541062 | 0.5%
3 | 541041 | 0.5%
Other values (6) | 16997 | < 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 81926723
73.5%
Lowercase Letter 29489845
 
26.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 40682156
49.7%
4 13390346
 
16.3%
1 13381861
 
16.3%
7 13380321
 
16.3%
2 541062
 
0.7%
3 541041
 
0.7%
0 7189
 
< 0.1%
8 1383
 
< 0.1%
9 1362
 
< 0.1%
6 2
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 13924167
47.2%
a 13922645
47.2%
c 1082102
 
3.7%
f 553870
 
1.9%
d 4212
 
< 0.1%
e 2849
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 81926723
73.5%
Latin 29489845
 
26.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 40682156
49.7%
4 13390346
 
16.3%
1 13381861
 
16.3%
7 13380321
 
16.3%
2 541062
 
0.7%
3 541041
 
0.7%
0 7189
 
< 0.1%
8 1383
 
< 0.1%
9 1362
 
< 0.1%
6 2
 
< 0.1%
Latin
ValueCountFrequency (%)
b 13924167
47.2%
a 13922645
47.2%
c 1082102
 
3.7%
f 553870
 
1.9%
d 4212
 
< 0.1%
e 2849
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 111416568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 40682156
36.5%
b 13924167
 
12.5%
a 13922645
 
12.5%
4 13390346
 
12.0%
1 13381861
 
12.0%
7 13380321
 
12.0%
c 1082102
 
1.0%
f 553870
 
0.5%
2 541062
 
0.5%
3 541041
 
0.5%
Other values (6) 16997
 
< 0.1%
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 106.3 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.00001594
Min length: 8

Characters and Unicode

Total characters: 111416790
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 13780319 | 98.9%
ab3c25cf | 143426 | 1.0%
be4fd70b | 1339 | < 0.1%
15f04f45 | 894 | < 0.1%
daf49a8a | 870 | < 0.1%
p28_48_88 | 222 | < 0.1%
71ddaa88 | 1 | < 0.1%
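
The length and character totals for this variable follow from the frequency table: six values are 8 characters long and one (shown as p28_48_88, 222 rows) is 9 characters long. The sketch below recomputes the totals from the counts copied from the table; it is a cross-check of the arithmetic, not the profiling tool's implementation.

```python
# Value frequencies copied from the report for this text variable.
value_counts = {
    "a55475b1": 13_780_319,
    "ab3c25cf": 143_426,
    "be4fd70b": 1_339,
    "15f04f45": 894,
    "daf49a8a": 870,
    "p28_48_88": 222,  # the only 9-character value
    "71ddaa88": 1,
}

n_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

print(f"rows: {n_rows}")
print(f"total characters: {total_chars}")
print(f"mean length: {mean_length:.8f}")
```

The 222 nine-character rows account for the 222 extra characters over 8 × 13927071 and for the mean length of 8.00001594.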

Most occurring characters

Value | Count | Frequency (%)
5 | 41486171 | 37.2%
b | 13926423 | 12.5%
a | 13926357 | 12.5%
4 | 13784538 | 12.4%
7 | 13781659 | 12.4%
1 | 13781214 | 12.4%
c | 286852 | 0.3%
f | 147423 | 0.1%
2 | 143648 | 0.1%
3 | 143426 | 0.1%
Other values (7) | 9079 | < 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 83125519
74.6%
Lowercase Letter 28290605
 
25.4%
Connector Punctuation 444
 
< 0.1%
Uppercase Letter 222
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 41486171
49.9%
4 13784538
 
16.6%
7 13781659
 
16.6%
1 13781214
 
16.6%
2 143648
 
0.2%
3 143426
 
0.2%
0 2233
 
< 0.1%
8 1760
 
< 0.1%
9 870
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 13926423
49.2%
a 13926357
49.2%
c 286852
 
1.0%
f 147423
 
0.5%
d 2211
 
< 0.1%
e 1339
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 444
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 222
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 83125963
74.6%
Latin 28290827
 
25.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 41486171
49.9%
4 13784538
 
16.6%
7 13781659
 
16.6%
1 13781214
 
16.6%
2 143648
 
0.2%
3 143426
 
0.2%
0 2233
 
< 0.1%
8 1760
 
< 0.1%
9 870
 
< 0.1%
_ 444
 
< 0.1%
Latin
ValueCountFrequency (%)
b 13926423
49.2%
a 13926357
49.2%
c 286852
 
1.0%
f 147423
 
0.5%
d 2211
 
< 0.1%
e 1339
 
< 0.1%
P 222
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 111416790
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 41486171
37.2%
b 13926423
 
12.5%
a 13926357
 
12.5%
4 13784538
 
12.4%
7 13781659
 
12.4%
1 13781214
 
12.4%
c 286852
 
0.3%
f 147423
 
0.1%
2 143648
 
0.1%
3 143426
 
0.1%
Other values (7) 9079
 
< 0.1%