Overview

Dataset statistics

Number of variables: 19
Number of observations: 26563901
Missing cells: 168378635
Missing cells (%): 33.4%
Total size in memory: 3.8 GiB
Average record size in memory: 152.0 B
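
The overview figures are tied together by simple arithmetic, and the sketch below recomputes them from the counts in the report. It assumes that every one of the 19 columns occupies 8 bytes per row (for example a float64 value or an object pointer). The report does not state this, but it is consistent with the 152.0 B average record size and the 202.7 MiB reported for each variable. The constant names are illustrative only.

```python
# Recompute the overview figures from the counts shown in the report.
N_VARIABLES = 19
N_OBSERVATIONS = 26_563_901
MISSING_CELLS = 168_378_635
BYTES_PER_VALUE = 8  # assumption: 8 bytes per cell in every column

# Share of missing cells among all cells (variables x observations).
total_cells = N_VARIABLES * N_OBSERVATIONS
missing_pct = 100 * MISSING_CELLS / total_cells

# Memory figures implied by the 8-bytes-per-cell assumption.
record_bytes = N_VARIABLES * BYTES_PER_VALUE
total_gib = record_bytes * N_OBSERVATIONS / 2**30
column_mib = BYTES_PER_VALUE * N_OBSERVATIONS / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {record_bytes:.1f} B")
print(f"Total size in memory: {total_gib:.1f} GiB")
print(f"Memory size per variable: {column_mib:.1f} MiB")
```

Under that assumption the four printed values agree with the report after rounding to one decimal.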

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 26206307 (98.7%) missing values [Missing]
collater_valueofguarantee_876L has 25552760 (96.2%) missing values [Missing]
pmts_dpd_1073P has 21941140 (82.6%) missing values [Missing]
pmts_dpd_303P has 15400316 (58.0%) missing values [Missing]
pmts_month_158T has 18102749 (68.1%) missing values [Missing]
pmts_month_706T has 2878073 (10.8%) missing values [Missing]
pmts_overdue_1140A has 21927579 (82.5%) missing values [Missing]
pmts_overdue_1152A has 15388889 (57.9%) missing values [Missing]
pmts_year_1139T has 18102749 (68.1%) missing values [Missing]
pmts_year_507T has 2878073 (10.8%) missing values [Missing]
collater_valueofguarantee_1124L is highly skewed (γ1 = 129.5424267) [Skewed]
collater_valueofguarantee_876L is highly skewed (γ1 = 80.45443074) [Skewed]
pmts_dpd_303P is highly skewed (γ1 = 51.3707085) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 391.4570011) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 567.0543363) [Skewed]
collater_valueofguarantee_1124L has 332596 (1.3%) zeros [Zeros]
collater_valueofguarantee_876L has 895899 (3.4%) zeros [Zeros]
num_group1 has 5951316 (22.4%) zeros [Zeros]
num_group2 has 1059354 (4.0%) zeros [Zeros]
pmts_dpd_1073P has 4352536 (16.4%) zeros [Zeros]
pmts_dpd_303P has 9369638 (35.3%) zeros [Zeros]
pmts_overdue_1140A has 4361805 (16.4%) zeros [Zeros]
pmts_overdue_1152A has 9299046 (35.0%) zeros [Zeros]
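
The percentages in the alerts are each count divided by the 26563901 observations, and the following sketch reproduces a few of them. The skewness cut-off of 20 is a hypothetical value. It was picked only because it separates the variables flagged as skewed in this report (the smallest flagged γ1 is 51.37) from pmts_dpd_1073P, which has a skewness of 17.05 and is not flagged. It is not taken from the profiler's configuration. The function names are illustrative.

```python
N_OBSERVATIONS = 26_563_901
SKEW_THRESHOLD = 20.0  # hypothetical cut-off, chosen only to match this report


def share(count, total=N_OBSERVATIONS):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


def is_highly_skewed(gamma1, threshold=SKEW_THRESHOLD):
    """Flag a variable whose absolute skewness exceeds the threshold."""
    return abs(gamma1) > threshold


# Missing-value counts copied from the alerts above.
missing_counts = {
    "collater_valueofguarantee_1124L": 26_206_307,
    "pmts_month_706T": 2_878_073,
}
for name, count in missing_counts.items():
    print(f"{name}: {share(count)}% missing")

print(is_highly_skewed(51.3707085))   # pmts_dpd_303P, flagged in the report
print(is_highly_skewed(17.05194512))  # pmts_dpd_1073P, not flagged
```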

Reproduction

Analysis started: 2024-02-13 19:45:49.664532
Analysis finished: 2024-02-13 19:47:16.517847
Duration: 1 minute and 26.85 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json

Variables

case_id
Real number (ℝ)

Distinct: 190486
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1393445.35
Minimum: 21161
Maximum: 2619253
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 21161
5-th percentile: 146140
Q1: 786082
median: 1475261
Q3: 1516336
95-th percentile: 2611496
Maximum: 2619253
Range: 2598092
Interquartile range (IQR): 730254

Descriptive statistics

Standard deviation: 717031.5068
Coefficient of variation (CV): 0.5145745451
Kurtosis: -0.3057720545
Mean: 1393445.35
Median Absolute Deviation (MAD): 45862
Skewness: 0.02311979714
Sum: 3.701534432 × 10^13
Variance: 5.141341818 × 10^11
Monotonicity: Increasing
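
Several case_id entries are derived from others in the same block: the range from the extremes, the IQR from the quartiles, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. The sketch below recomputes them from the reported values as a consistency check. The variable names are illustrative.

```python
import math

# Values copied from the case_id statistics.
minimum, maximum = 21161, 2619253
q1, q3 = 786082, 1516336
mean, std = 1393445.35, 717031.5068

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr)
print(f"{cv:.6f}")
print(f"{variance:.6e}")

# The recomputed values agree with the report up to rounding of the inputs.
assert math.isclose(cv, 0.5145745451, rel_tol=1e-6)
assert math.isclose(variance, 5.141341818e11, rel_tol=1e-6)
```
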
Histogram with fixed size bins (bins=50)

Value                   Count     Frequency (%)
1447318                 4257      < 0.1%
1491009                 3420      < 0.1%
749548                  3096      < 0.1%
796027                  2280      < 0.1%
1467304                 1974      < 0.1%
744230                  1710      < 0.1%
1511666                 1692      < 0.1%
2611718                 1680      < 0.1%
1519099                 1620      < 0.1%
2601063                 1572      < 0.1%
Other values (190476)   26540600  99.9%

Value                   Count     Frequency (%)
21161                   36        < 0.1%
21654                   72        < 0.1%
21739                   12        < 0.1%
21746                   24        < 0.1%
21758                   36        < 0.1%

Value                   Count     Frequency (%)
2619253                 252       < 0.1%
2619252                 48        < 0.1%
2619251                 84        < 0.1%
2619250                 36        < 0.1%
2619249                 24        < 0.1%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 202.7 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 212511208
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9a0c095e
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value                   Count     Frequency (%)
a55475b1                26206307  98.7%
9a0c095e                251105    0.9%
8fd95e4b                106318    0.4%
06fb9ba8                171       < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 78976344
37.2%
a 26457583
 
12.4%
b 26312967
 
12.4%
4 26312625
 
12.4%
7 26206307
 
12.3%
1 26206307
 
12.3%
9 608699
 
0.3%
0 502381
 
0.2%
e 357423
 
0.2%
c 251105
 
0.1%
Other values (4) 319467
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 158919323
74.8%
Lowercase Letter 53591885
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 78976344
49.7%
4 26312625
 
16.6%
7 26206307
 
16.5%
1 26206307
 
16.5%
9 608699
 
0.4%
0 502381
 
0.3%
8 106489
 
0.1%
6 171
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 26457583
49.4%
b 26312967
49.1%
e 357423
 
0.7%
c 251105
 
0.5%
f 106489
 
0.2%
d 106318
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 158919323
74.8%
Latin 53591885
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 78976344
49.7%
4 26312625
 
16.6%
7 26206307
 
16.5%
1 26206307
 
16.5%
9 608699
 
0.4%
0 502381
 
0.3%
8 106489
 
0.1%
6 171
 
< 0.1%
Latin
ValueCountFrequency (%)
a 26457583
49.4%
b 26312967
49.1%
e 357423
 
0.7%
c 251105
 
0.5%
f 106489
 
0.2%
d 106318
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212511208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 78976344
37.2%
a 26457583
 
12.4%
b 26312967
 
12.4%
4 26312625
 
12.4%
7 26206307
 
12.3%
1 26206307
 
12.3%
9 608699
 
0.3%
0 502381
 
0.2%
e 357423
 
0.2%
c 251105
 
0.1%
Other values (4) 319467
 
0.2%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 202.7 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 212511208
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value                   Count     Frequency (%)
a55475b1                25552760  96.2%
9a0c095e                566759    2.1%
8fd95e4b                442713    1.7%
06fb9ba8                1409      < 0.1%
3cbe86ba                249       < 0.1%
c7a5ad39                6         < 0.1%
9276e4bb                5         < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 77667758
36.5%
a 26121189
 
12.3%
b 25998799
 
12.2%
4 25995478
 
12.2%
7 25552771
 
12.0%
1 25552760
 
12.0%
9 1577651
 
0.7%
0 1134927
 
0.5%
e 1009726
 
0.5%
c 567014
 
0.3%
Other values (6) 1333135
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 157927639
74.3%
Lowercase Letter 54583569
 
25.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 77667758
49.2%
4 25995478
 
16.5%
7 25552771
 
16.2%
1 25552760
 
16.2%
9 1577651
 
1.0%
0 1134927
 
0.7%
8 444371
 
0.3%
6 1663
 
< 0.1%
3 255
 
< 0.1%
2 5
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 26121189
47.9%
b 25998799
47.6%
e 1009726
 
1.8%
c 567014
 
1.0%
f 444122
 
0.8%
d 442719
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common 157927639
74.3%
Latin 54583569
 
25.7%

Most frequent character per script

Common
ValueCountFrequency (%)
5 77667758
49.2%
4 25995478
 
16.5%
7 25552771
 
16.2%
1 25552760
 
16.2%
9 1577651
 
1.0%
0 1134927
 
0.7%
8 444371
 
0.3%
6 1663
 
< 0.1%
3 255
 
< 0.1%
2 5
 
< 0.1%
Latin
ValueCountFrequency (%)
a 26121189
47.9%
b 25998799
47.6%
e 1009726
 
1.8%
c 567014
 
1.0%
f 444122
 
0.8%
d 442719
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212511208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 77667758
36.5%
a 26121189
 
12.3%
b 25998799
 
12.2%
4 25995478
 
12.2%
7 25552771
 
12.0%
1 25552760
 
12.0%
9 1577651
 
0.7%
0 1134927
 
0.5%
e 1009726
 
0.5%
c 567014
 
0.3%
Other values (6) 1333135
 
0.6%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 16290
Distinct (%): 4.6%
Missing: 26206307
Missing (%): 98.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1426868.755
Minimum: 0
Maximum: 7125000000
Zeros: 332596
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3421453
Maximum: 7125000000
Range: 7125000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 42327150.07
Coefficient of variation (CV): 29.66436115
Kurtosis: 20228.27979
Mean: 1426868.755
Median Absolute Deviation (MAD): 0
Skewness: 129.5424267
Sum: 5.102397057 × 10^11
Variance: 1.791587633 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                   Count     Frequency (%)
0                       332596    1.3%
2000000                 156       < 0.1%
3000000                 149       < 0.1%
5000000                 148       < 0.1%
4000000                 130       < 0.1%
1                       130       < 0.1%
10000000                116       < 0.1%
6000000                 106       < 0.1%
2500000                 93        < 0.1%
1000000                 75        < 0.1%
Other values (16280)    23895     0.1%
(Missing)               26206307  98.7%

Value                   Count     Frequency (%)
0                       332596    1.3%
1                       130       < 0.1%
2                       1         < 0.1%
35.55                   1         < 0.1%
40.36                   1         < 0.1%

Value                   Count     Frequency (%)
7125000000              8         < 0.1%
6105000000              1         < 0.1%
4850728000              1         < 0.1%
3200000000              5         < 0.1%
2180329000              2         < 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 35970
Distinct (%): 3.6%
Missing: 25552760
Missing (%): 96.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3508955.841
Minimum: 0
Maximum: 1.4 × 10^10
Zeros: 895899
Zeros (%): 3.4%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 200000
Maximum: 1.4 × 10^10
Range: 1.4 × 10^10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 77098098.56
Coefficient of variation (CV): 21.97180645
Kurtosis: 12302.16272
Mean: 3508955.841
Median Absolute Deviation (MAD): 0
Skewness: 80.45443074
Sum: 3.548049118 × 10^12
Variance: 5.944116801 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                   Count     Frequency (%)
0                       895899    3.4%
60000                   3074      < 0.1%
130000                  2408      < 0.1%
100000                  2191      < 0.1%
50000                   1628      < 0.1%
65000                   1449      < 0.1%
70000                   1135      < 0.1%
300000                  1052      < 0.1%
150000                  1020      < 0.1%
200000                  1013      < 0.1%
Other values (35960)    100272    0.4%
(Missing)               25552760  96.2%

Value                   Count     Frequency (%)
0                       895899    3.4%
0.01                    3         < 0.1%
0.02                    18        < 0.1%
0.03                    46        < 0.1%
0.04                    10        < 0.1%

Value                   Count     Frequency (%)
1.4 × 10^10             11        < 0.1%
4000000000              11        < 0.1%
3250000000              67        < 0.1%
3200000000              19        < 0.1%
2000000000              223       < 0.1%

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 202.7 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 212511208
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value                   Count     Frequency (%)
a55475b1                25552760  96.2%
c7a5ad39                753868    2.8%
3cbe86ba                181414    0.7%
9276e4bb                26820     0.1%
0e63c0f0                14664     0.1%
168ad9f3                7833      < 0.1%
7b62420e                5926      < 0.1%
5224034a                5614      < 0.1%
940efad7                5327      < 0.1%
2fd21cf1                2893      < 0.1%
Other values (5)        6782      < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 77421598
36.4%
a 27267414
 
12.8%
7 26348935
 
12.4%
b 25977588
 
12.2%
4 25610283
 
12.1%
1 25567577
 
12.0%
3 966031
 
0.5%
c 956675
 
0.5%
9 799124
 
0.4%
d 771631
 
0.4%
Other values (6) 824352
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 157267141
74.0%
Lowercase Letter 55244067
 
26.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 77421598
49.2%
7 26348935
 
16.8%
4 25610283
 
16.3%
1 25567577
 
16.3%
3 966031
 
0.6%
9 799124
 
0.5%
6 239091
 
0.2%
8 192155
 
0.1%
0 64951
 
< 0.1%
2 57396
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 27267414
49.4%
b 25977588
47.0%
c 956675
 
1.7%
d 771631
 
1.4%
e 235349
 
0.4%
f 35410
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 157267141
74.0%
Latin 55244067
 
26.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 77421598
49.2%
7 26348935
 
16.8%
4 25610283
 
16.3%
1 25567577
 
16.3%
3 966031
 
0.6%
9 799124
 
0.5%
6 239091
 
0.2%
8 192155
 
0.1%
0 64951
 
< 0.1%
2 57396
 
< 0.1%
Latin
ValueCountFrequency (%)
a 27267414
49.4%
b 25977588
47.0%
c 956675
 
1.7%
d 771631
 
1.4%
e 235349
 
0.4%
f 35410
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212511208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 77421598
36.4%
a 27267414
 
12.8%
7 26348935
 
12.4%
b 25977588
 
12.2%
4 25610283
 
12.1%
1 25567577
 
12.0%
3 966031
 
0.5%
c 956675
 
0.5%
9 799124
 
0.4%
d 771631
 
0.4%
Other values (6) 824352
 
0.4%

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 202.7 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 212511208
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c7a5ad39
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value                   Count     Frequency (%)
a55475b1                26206307  98.7%
c7a5ad39                324608    1.2%
9276e4bb                13407     0.1%
0e63c0f0                9232      < 0.1%
7b62420e                3634      < 0.1%
168ad9f3                3413      < 0.1%
940efad7                948       < 0.1%
f4d8a027                634       < 0.1%
46ab00a7                487       < 0.1%
3cbe86ba                473       < 0.1%
Other values (5)        758       < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 78943886
37.1%
a 26862261
 
12.6%
7 26550095
 
12.5%
b 26238255
 
12.3%
4 26226006
 
12.3%
1 26210580
 
12.3%
9 342534
 
0.2%
3 338019
 
0.2%
c 334854
 
0.2%
d 330001
 
0.2%
Other values (6) 134717
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 158703053
74.7%
Lowercase Letter 53808155
 
25.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 78943886
49.7%
7 26550095
 
16.7%
4 26226006
 
16.5%
1 26210580
 
16.5%
9 342534
 
0.2%
3 338019
 
0.2%
0 34103
 
< 0.1%
6 30713
 
< 0.1%
2 22533
 
< 0.1%
8 4584
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 26862261
49.9%
b 26238255
48.8%
c 334854
 
0.6%
d 330001
 
0.6%
e 27758
 
0.1%
f 15026
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 158703053
74.7%
Latin 53808155
 
25.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 78943886
49.7%
7 26550095
 
16.7%
4 26226006
 
16.5%
1 26210580
 
16.5%
9 342534
 
0.2%
3 338019
 
0.2%
0 34103
 
< 0.1%
6 30713
 
< 0.1%
2 22533
 
< 0.1%
8 4584
 
< 0.1%
Latin
ValueCountFrequency (%)
a 26862261
49.9%
b 26238255
48.8%
c 334854
 
0.6%
d 330001
 
0.6%
e 27758
 
0.1%
f 15026
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212511208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 78943886
37.1%
a 26862261
 
12.6%
7 26550095
 
12.5%
b 26238255
 
12.3%
4 26226006
 
12.3%
1 26210580
 
12.3%
9 342534
 
0.2%
3 338019
 
0.2%
c 334854
 
0.2%
d 330001
 
0.2%
Other values (6) 134717
 
0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 243
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.266595708
Minimum: 0
Maximum: 242
Zeros: 5951316
Zeros (%): 22.4%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 6
95-th percentile: 14
Maximum: 242
Range: 242
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.91383364
Coefficient of variation (CV): 1.386077811
Kurtosis: 154.5250751
Mean: 4.266595708
Median Absolute Deviation (MAD): 2
Skewness: 7.231703196
Sum: 113337426
Variance: 34.97342832
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                   Count     Frequency (%)
0                       5951316   22.4%
1                       4165284   15.7%
2                       3113338   11.7%
3                       2451715   9.2%
4                       1986648   7.5%
5                       1623299   6.1%
6                       1325783   5.0%
7                       1085320   4.1%
8                       888555    3.3%
9                       726582    2.7%
Other values (233)      3246061   12.2%

Value                   Count     Frequency (%)
0                       5951316   22.4%
1                       4165284   15.7%
2                       3113338   11.7%
3                       2451715   9.2%
4                       1986648   7.5%

Value                   Count     Frequency (%)
242                     36        < 0.1%
241                     24        < 0.1%
240                     15        < 0.1%
239                     15        < 0.1%
238                     15        < 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct: 36
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.65429483
Minimum: 0
Maximum: 35
Zeros: 1059354
Zeros (%): 4.0%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
median: 12
Q3: 20
95-th percentile: 32
Maximum: 35
Range: 35
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 9.422602428
Coefficient of variation (CV): 0.6900834168
Kurtosis: -0.7283918749
Mean: 13.65429483
Median Absolute Deviation (MAD): 7
Skewness: 0.4660985163
Sum: 362711336
Variance: 88.78543652
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)

Value                   Count     Frequency (%)
0                       1059354   4.0%
1                       1059203   4.0%
2                       1059196   4.0%
7                       1059194   4.0%
11                      1059194   4.0%
3                       1059194   4.0%
8                       1059194   4.0%
10                      1059194   4.0%
6                       1059194   4.0%
5                       1059194   4.0%
Other values (26)       15971790  60.1%

Value                   Count     Frequency (%)
0                       1059354   4.0%
1                       1059203   4.0%
2                       1059196   4.0%
3                       1059194   4.0%
4                       1059194   4.0%

Value                   Count     Frequency (%)
35                      349802    1.3%
34                      349802    1.3%
33                      349802    1.3%
32                      349802    1.3%
31                      349803    1.3%

pmts_dpd_1073P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3823
Distinct (%): 0.1%
Missing: 21941140
Missing (%): 82.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.86476134
Minimum: 0
Maximum: 4841
Zeros: 4352536
Zeros (%): 16.4%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 4841
Range: 4841
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 136.3209385
Coefficient of variation (CV): 11.48956431
Kurtosis: 353.8199747
Mean: 11.86476134
Median Absolute Deviation (MAD): 0
Skewness: 17.05194512
Sum: 54847956
Variance: 18583.39827
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                   Count     Frequency (%)
0                       4352536   16.4%
1                       44043     0.2%
2                       16427     0.1%
3                       16077     0.1%
4                       14441     0.1%
7                       9387      < 0.1%
5                       8627      < 0.1%
6                       7885      < 0.1%
10                      6602      < 0.1%
8                       6273      < 0.1%
Other values (3813)     140463    0.5%
(Missing)               21941140  82.6%

Value                   Count     Frequency (%)
0                       4352536   16.4%
1                       44043     0.2%
2                       16427     0.1%
3                       16077     0.1%
4                       14441     0.1%

Value                   Count     Frequency (%)
4841                    1         < 0.1%
4816                    1         < 0.1%
4797                    1         < 0.1%
4791                    1         < 0.1%
4788                    1         < 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 4088
Distinct (%): < 0.1%
Missing: 15400316
Missing (%): 58.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.50328689
Minimum: -18
Maximum: 84575
Zeros: 9369638
Zeros (%): 35.3%
Negative: 1862
Negative (%): < 0.1%
Memory size: 202.7 MiB

Quantile statistics

Minimum: -18
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 324
Maximum: 84575
Range: 84593
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 277.0201669
Coefficient of variation (CV): 5.177628947
Kurtosis: 14024.54881
Mean: 53.50328689
Median Absolute Deviation (MAD): 0
Skewness: 51.3707085
Sum: 597288491
Variance: 76740.1729
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                   Count     Frequency (%)
0                       9369638   35.3%
1                       268619    1.0%
3                       68817     0.3%
2                       64720     0.2%
4                       56309     0.2%
6                       47258     0.2%
5                       37046     0.1%
7                       36609     0.1%
9                       25836     0.1%
8                       25368     0.1%
Other values (4078)     1163365   4.4%
(Missing)               15400316  58.0%

Value                   Count     Frequency (%)
-18                     11        < 0.1%
-13                     1         < 0.1%
-12                     8         < 0.1%
-11                     3         < 0.1%
-9                      18        < 0.1%

Value                   Count     Frequency (%)
84575                   3         < 0.1%
84574                   3         < 0.1%
84561                   2         < 0.1%
84560                   3         < 0.1%
84533                   1         < 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 18102749
Missing (%): 68.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052734
Coefficient of variation (CV): 0.5310850359
Kurtosis: -1.216783227
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 54997488
Variance: 11.91666808
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Value                   Count     Frequency (%)
2                       705096    2.7%
3                       705096    2.7%
4                       705096    2.7%
5                       705096    2.7%
6                       705096    2.7%
7                       705096    2.7%
8                       705096    2.7%
9                       705096    2.7%
10                      705096    2.7%
11                      705096    2.7%
Other values (2)        1410192   5.3%
(Missing)               18102749  68.1%

Value                   Count     Frequency (%)
1                       705096    2.7%
2                       705096    2.7%
3                       705096    2.7%
4                       705096    2.7%
5                       705096    2.7%

Value                   Count     Frequency (%)
12                      705096    2.7%
11                      705096    2.7%
10                      705096    2.7%
9                       705096    2.7%
8                       705096    2.7%
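
Every month from 1 to 12 occurs the same number of times (705096) in pmts_month_158T, so its non-missing values follow a discrete uniform distribution. The sketch below recomputes the moments from that observation alone. It uses population formulas (dividing by n), so the variance comes out as 143/12 ≈ 11.916667, marginally below the reported 11.91666808. The small gap is what an n - 1 denominator would produce, though the report does not say which convention it uses. The variable names are illustrative.

```python
from statistics import fmean

# One representative of each month; every month has the same count,
# so the moments of 1..12 equal those of the full non-missing column.
months = list(range(1, 13))
COUNT_PER_MONTH = 705_096  # from the frequency table

mean = fmean(months)
m2 = fmean([(m - mean) ** 2 for m in months])  # population variance
m4 = fmean([(m - mean) ** 4 for m in months])  # fourth central moment
excess_kurtosis = m4 / m2**2 - 3

n_present = len(months) * COUNT_PER_MONTH      # non-missing observations
total = mean * n_present                       # should match the reported Sum

print(mean, round(m2, 6), round(excess_kurtosis, 6))
print(n_present, total)
```

The same reasoning applies to pmts_month_706T, where each month occurs 1973819 times.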

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 2878073
Missing (%): 10.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052602
Coefficient of variation (CV): 0.5310850158
Kurtosis: -1.21678322
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 153957882
Variance: 11.91666717
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Value                   Count     Frequency (%)
2                       1973819   7.4%
3                       1973819   7.4%
4                       1973819   7.4%
5                       1973819   7.4%
6                       1973819   7.4%
7                       1973819   7.4%
8                       1973819   7.4%
9                       1973819   7.4%
10                      1973819   7.4%
11                      1973819   7.4%
Other values (2)        3947638   14.9%
(Missing)               2878073   10.8%

Value                   Count     Frequency (%)
1                       1973819   7.4%
2                       1973819   7.4%
3                       1973819   7.4%
4                       1973819   7.4%
5                       1973819   7.4%

Value                   Count     Frequency (%)
12                      1973819   7.4%
11                      1973819   7.4%
10                      1973819   7.4%
9                       1973819   7.4%
8                       1973819   7.4%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 195886
Distinct (%): 4.2%
Missing: 21927579
Missing (%): 82.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1885.363037
Minimum: 0
Maximum: 70698930
Zeros: 4361805
Zeros (%): 16.4%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 607.2246
Maximum: 70698930
Range: 70698930
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 136944.0969
Coefficient of variation (CV): 72.63539922
Kurtosis: 189296.3778
Mean: 1885.363037
Median Absolute Deviation (MAD): 0
Skewness: 391.4570011
Sum: 8741150129
Variance: 1.875368568 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                   Count     Frequency (%)
0                       4361805   16.4%
400                     359       < 0.1%
1000                    357       < 0.1%
10                      311       < 0.1%
2000                    256       < 0.1%
0.2                     227       < 0.1%
14                      208       < 0.1%
2                       207       < 0.1%
0.4                     204       < 0.1%
4                       203       < 0.1%
Other values (195876)   272185    1.0%
(Missing)               21927579  82.5%

Value                   Count     Frequency (%)
0                       4361805   16.4%
0.002                   36        < 0.1%
0.004                   13        < 0.1%
0.006                   10        < 0.1%
0.008                   16        < 0.1%

Value                   Count     Frequency (%)
70698930                1         < 0.1%
70695176                11        < 0.1%
31050000                1         < 0.1%
30950000                1         < 0.1%
30650000                1         < 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 753948
Distinct (%): 6.7%
Missing: 15388889
Missing (%): 57.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 4051.316657
Minimum: 0
Maximum: 147928240
Zeros: 9299046
Zeros (%): 35.0%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 15003.2657
Maximum: 147928240
Range: 147928240
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 121986.7208
Coefficient of variation (CV): 30.11038907
Kurtosis: 493429.9297
Mean: 4051.316657
Median Absolute Deviation (MAD): 0
Skewness: 567.0543363
Sum: 4.527351225 × 10^10
Variance: 1.488076005 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                   Count     Frequency (%)
0                       9299046   35.0%
0.2                     6325      < 0.1%
1000                    3624      < 0.1%
0.4                     2586      < 0.1%
2000                    2462      < 0.1%
3000                    2147      < 0.1%
0.8                     1842      < 0.1%
0.6                     1794      < 0.1%
1.2                     1697      < 0.1%
2                       1684      < 0.1%
Other values (753938)   1851805   7.0%
(Missing)               15388889  57.9%

Value                   Count     Frequency (%)
0                       9299046   35.0%
0.002                   168       < 0.1%
0.004                   115       < 0.1%
0.006                   96        < 0.1%
0.008                   56        < 0.1%

Value                   Count     Frequency (%)
147928240               2         < 0.1%
58727596                12        < 0.1%
58690590                2         < 0.1%
58640588                5         < 0.1%
45464460                2         < 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct: 9
Distinct (%): < 0.1%
Missing: 18102749
Missing (%): 68.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.407061
Minimum: 2008
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 2008
5-th percentile: 2017
Q1: 2018
median: 2019
Q3: 2019
95-th percentile: 2019
Maximum: 2020
Range: 12
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7990894875
Coefficient of variation (CV): 0.0003959010563
Kurtosis: -0.6013368328
Mean: 2018.407061
Median Absolute Deviation (MAD): 1
Skewness: -0.3825436167
Sum: 1.707804894 × 10^10
Variance: 0.638544009
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Value                   Count     Frequency (%)
2019                    4075761   15.3%
2018                    2706914   10.2%
2017                    1328982   5.0%
2020                    349151    1.3%
2016                    309       < 0.1%
2009                    12        < 0.1%
2008                    11        < 0.1%
2015                    11        < 0.1%
2010                    1         < 0.1%
(Missing)               18102749  68.1%

Value                   Count     Frequency (%)
2008                    11        < 0.1%
2009                    12        < 0.1%
2010                    1         < 0.1%
2015                    11        < 0.1%
2016                    309       < 0.1%

Value                   Count     Frequency (%)
2020                    349151    1.3%
2019                    4075761   15.3%
2018                    2706914   10.2%
2017                    1328982   5.0%
2016                    309       < 0.1%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct: 21
Distinct (%): < 0.1%
Missing: 2878073
Missing (%): 10.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.051273
Minimum: 2000
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 202.7 MiB

Quantile statistics

Minimum: 2000
5-th percentile: 2007
Q1: 2011
median: 2015
Q3: 2017
95-th percentile: 2019
Maximum: 2020
Range: 20
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.870855187
Coefficient of variation (CV): 0.001921924849
Kurtosis: -0.6500247174
Mean: 2014.051273
Median Absolute Deviation (MAD): 3
Skewness: -0.639709729
Sum: 4.770447203 × 10^10
Variance: 14.98351988
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)

Value                   Count     Frequency (%)
2018                    3393439   12.8%
2017                    3063082   11.5%
2016                    2311533   8.7%
2015                    2055410   7.7%
2014                    1906316   7.2%
2019                    1757830   6.6%
2013                    1647605   6.2%
2012                    1352888   5.1%
2011                    1170407   4.4%
2008                    983557    3.7%
Other values (11)       4043761   15.2%
(Missing)               2878073   10.8%

Value                   Count     Frequency (%)
2000                    11        < 0.1%
2001                    166       < 0.1%
2002                    675       < 0.1%
2003                    1996      < 0.1%
2004                    57442     0.2%

Value                   Count     Frequency (%)
2020                    133913    0.5%
2019                    1757830   6.6%
2018                    3393439   12.8%
2017                    3063082   11.5%
2016                    2311533   8.7%

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 202.7 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000000038
Min length: 8

Characters and Unicode

Total characters: 212511209
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value                   Count     Frequency (%)
a55475b1                25563490  96.2%
ab3c25cf                982428    3.7%
15f04f45                9803      < 0.1%
be4fd70b                4917      < 0.1%
daf49a8a                3186      < 0.1%
71ddaa88                29        < 0.1%
0c42a10e                26        < 0.1%
1d94eac1                13        < 0.1%
652d52e3                8         < 0.1%
p28_48_88               1         < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 77692520
36.6%
b 26555752
 
12.5%
a 26555573
 
12.5%
4 25591239
 
12.0%
1 25573374
 
12.0%
7 25568436
 
12.0%
c 1964895
 
0.9%
f 1010137
 
0.5%
2 982471
 
0.5%
3 982436
 
0.5%
Other values (8) 34376
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 156411703
73.6%
Lowercase Letter 56099503
 
26.4%
Connector Punctuation 2
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 77692520
49.7%
4 25591239
 
16.4%
1 25573374
 
16.4%
7 25568436
 
16.3%
2 982471
 
0.6%
3 982436
 
0.6%
0 14772
 
< 0.1%
8 3248
 
< 0.1%
9 3199
 
< 0.1%
6 8
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 26555752
47.3%
a 26555573
47.3%
c 1964895
 
3.5%
f 1010137
 
1.8%
d 8182
 
< 0.1%
e 4964
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 156411705
73.6%
Latin 56099504
 
26.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 77692520
49.7%
4 25591239
 
16.4%
1 25573374
 
16.4%
7 25568436
 
16.3%
2 982471
 
0.6%
3 982436
 
0.6%
0 14772
 
< 0.1%
8 3248
 
< 0.1%
9 3199
 
< 0.1%
6 8
 
< 0.1%
Latin
ValueCountFrequency (%)
b 26555752
47.3%
a 26555573
47.3%
c 1964895
 
3.5%
f 1010137
 
1.8%
d 8182
 
< 0.1%
e 4964
 
< 0.1%
P 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212511209
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 77692520
36.6%
b 26555752
 
12.5%
a 26555573
 
12.5%
4 25591239
 
12.0%
1 25573374
 
12.0%
7 25568436
 
12.0%
c 1964895
 
0.9%
f 1010137
 
0.5%
2 982471
 
0.5%
3 982436
 
0.5%
Other values (8) 34376
 
< 0.1%

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 202.7 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000020592
Min length: 8

Characters and Unicode

Total characters: 212511755
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value                   Count     Frequency (%)
a55475b1                26212493  98.7%
ab3c25cf                342698    1.3%
be4fd70b                3408      < 0.1%
daf49a8a                2405      < 0.1%
15f04f45                2339      < 0.1%
p28_48_88               547       < 0.1%
0c42a10e                9         < 0.1%
71ddaa88                2         < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 78984855
37.2%
a 26562419
 
12.5%
b 26562007
 
12.5%
4 26223540
 
12.3%
7 26215903
 
12.3%
1 26214843
 
12.3%
c 685405
 
0.3%
f 353189
 
0.2%
2 343254
 
0.2%
3 342698
 
0.2%
Other values (7) 23642
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 158337860
74.5%
Lowercase Letter 54172254
 
25.5%
Connector Punctuation 1094
 
< 0.1%
Uppercase Letter 547
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 78984855
49.9%
4 26223540
 
16.6%
7 26215903
 
16.6%
1 26214843
 
16.6%
2 343254
 
0.2%
3 342698
 
0.2%
0 5765
 
< 0.1%
8 4597
 
< 0.1%
9 2405
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 26562419
49.0%
b 26562007
49.0%
c 685405
 
1.3%
f 353189
 
0.7%
d 5817
 
< 0.1%
e 3417
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1094
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 547
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 158338954
74.5%
Latin 54172801
 
25.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 78984855
49.9%
4 26223540
 
16.6%
7 26215903
 
16.6%
1 26214843
 
16.6%
2 343254
 
0.2%
3 342698
 
0.2%
0 5765
 
< 0.1%
8 4597
 
< 0.1%
9 2405
 
< 0.1%
_ 1094
 
< 0.1%
Latin
ValueCountFrequency (%)
a 26562419
49.0%
b 26562007
49.0%
c 685405
 
1.3%
f 353189
 
0.7%
d 5817
 
< 0.1%
e 3417
 
< 0.1%
P 547
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212511755
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 78984855
37.2%
a 26562419
 
12.5%
b 26562007
 
12.5%
4 26223540
 
12.3%
7 26215903
 
12.3%
1 26214843
 
12.3%
c 685405
 
0.3%
f 353189
 
0.2%
2 343254
 
0.2%
3 342698
 
0.2%
Other values (7) 23642
 
< 0.1%