Overview

Dataset statistics

Number of variables: 19
Number of observations: 18723227
Missing cells: 119399400
Missing cells (%): 33.6%
Total size in memory: 2.7 GiB
Average record size in memory: 152.0 B
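The figures in this block are mutually consistent, which can be checked with a little arithmetic. The sketch below assumes every cell is stored as an 8-byte value (for example a 64-bit float or an object pointer); that width is an assumption that happens to reproduce the reported sizes, not something the report states.

```python
# Sanity check of the dataset statistics (values copied from the report).
n_vars = 19
n_obs = 18_723_227
missing_cells = 119_399_400
bytes_per_value = 8  # assumption: 8 bytes per cell

total_cells = n_vars * n_obs
missing_pct = 100 * missing_cells / total_cells

record_size_bytes = n_vars * bytes_per_value            # bytes per row
total_gib = n_obs * record_size_bytes / 2**30           # whole frame
column_mib = n_obs * bytes_per_value / 2**20            # one column

print(f"missing cells: {missing_pct:.1f}%")
print(f"record size:   {record_size_bytes} B")
print(f"total size:    {total_gib:.1f} GiB")
print(f"per column:    {column_mib:.1f} MiB")
```

Under that assumption the per-column figure also matches the "Memory size" reported for each variable further down.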

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 18526243 (98.9%) missing values [Missing]
collater_valueofguarantee_876L has 17972743 (96.0%) missing values [Missing]
pmts_dpd_1073P has 15922857 (85.0%) missing values [Missing]
pmts_dpd_303P has 10412923 (55.6%) missing values [Missing]
pmts_month_158T has 13871207 (74.1%) missing values [Missing]
pmts_month_706T has 1255931 (6.7%) missing values [Missing]
pmts_overdue_1140A has 15907992 (85.0%) missing values [Missing]
pmts_overdue_1152A has 10402366 (55.6%) missing values [Missing]
pmts_year_1139T has 13871207 (74.1%) missing values [Missing]
pmts_year_507T has 1255931 (6.7%) missing values [Missing]
collater_valueofguarantee_876L is highly skewed (γ1 = 54.73401012) [Skewed]
pmts_dpd_1073P is highly skewed (γ1 = 28.43925387) [Skewed]
pmts_dpd_303P is highly skewed (γ1 = 34.75300003) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 389.9098383) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 700.641166) [Skewed]
collater_valueofguarantee_876L has 684497 (3.7%) zeros [Zeros]
num_group1 has 3284738 (17.5%) zeros [Zeros]
num_group2 has 756865 (4.0%) zeros [Zeros]
pmts_dpd_1073P has 2670658 (14.3%) zeros [Zeros]
pmts_dpd_303P has 7004364 (37.4%) zeros [Zeros]
pmts_overdue_1140A has 2682948 (14.3%) zeros [Zeros]
pmts_overdue_1152A has 6962787 (37.2%) zeros [Zeros]
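The missing-value alerts can be cross-checked against the dataset statistics. The sketch below copies the per-column missing counts from the alerts, recomputes each percentage, and sums the counts; the variable names in the code are illustrative only.

```python
# Missing counts copied from the alerts; n_obs from the dataset statistics.
n_obs = 18_723_227
missing = {
    "collater_valueofguarantee_1124L": (18_526_243, 98.9),
    "collater_valueofguarantee_876L": (17_972_743, 96.0),
    "pmts_dpd_1073P": (15_922_857, 85.0),
    "pmts_dpd_303P": (10_412_923, 55.6),
    "pmts_month_158T": (13_871_207, 74.1),
    "pmts_month_706T": (1_255_931, 6.7),
    "pmts_overdue_1140A": (15_907_992, 85.0),
    "pmts_overdue_1152A": (10_402_366, 55.6),
    "pmts_year_1139T": (13_871_207, 74.1),
    "pmts_year_507T": (1_255_931, 6.7),
}

# Recompute each percentage and compare with the reported one.
mismatches = [
    name
    for name, (count, reported_pct) in missing.items()
    if round(100 * count / n_obs, 1) != reported_pct
]
total_missing = sum(count for count, _ in missing.values())

print("mismatching columns:", mismatches)
print("total missing cells:", total_missing)
```

The total equals the "Missing cells" figure in the dataset statistics, so these ten columns account for every missing cell in the data.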

Reproduction

Analysis started: 2024-02-13 19:52:20.552408
Analysis finished: 2024-02-13 19:52:48.361400
Duration: 27.81 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
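The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-02-13 19:52:20.552408")
finished = datetime.fromisoformat("2024-02-13 19:52:48.361400")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```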

Variables

case_id
Real number (ℝ)

Distinct103033
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1557410.562
Minimum53716
Maximum2700533
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum53716
5-th percentile241804
Q11008820
median1899366
Q31923388
95-th percentile2695838
Maximum2700533
Range2646817
Interquartile range (IQR)914568

Descriptive statistics

Standard deviation: 798141.0648
Coefficient of variation (CV): 0.5124795505
Kurtosis: -0.8108124622
Mean: 1557410.562
Median Absolute Deviation (MAD): 30350
Skewness: -0.5993108845
Sum: 2.915975148 × 10^13
Variance: 6.370291593 × 10^11
Monotonicity: Increasing
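Several of the case_id statistics are derived from others, so they can be recomputed from the reported values as a consistency check. The sketch below uses the rounded figures printed in the report, so agreement is only expected to within rounding.

```python
# Values copied from the case_id summary, quantile and descriptive blocks.
n_obs = 18_723_227
mean = 1557410.562
std = 798141.0648
minimum, maximum = 53716, 2700533
q1, q3 = 1008820, 1923388

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_obs              # Sum (no missing values for case_id)

print(value_range, iqr)
print(f"CV = {cv:.10f}")
print(f"variance = {variance:.9e}")
print(f"sum = {total:.9e}")
```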
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
245030 2787
 
< 0.1%
252165 2172
 
< 0.1%
1022291 2100
 
< 0.1%
1019972 1860
 
< 0.1%
1927556 1752
 
< 0.1%
1887379 1728
 
< 0.1%
252127 1692
 
< 0.1%
1930975 1668
 
< 0.1%
1900015 1668
 
< 0.1%
1895121 1536
 
< 0.1%
Other values (103023) 18704264
99.9%
ValueCountFrequency (%)
53716 120
< 0.1%
53785 72
 
< 0.1%
53786 60
 
< 0.1%
53788 204
< 0.1%
53789 96
< 0.1%
ValueCountFrequency (%)
2700533 264
< 0.1%
2700532 144
 
< 0.1%
2700531 528
< 0.1%
2700530 360
< 0.1%
2700529 48
 
< 0.1%
Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size142.8 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters: 149785816
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row8fd95e4b
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 18526243
98.9%
9a0c095e 133066
 
0.7%
8fd95e4b 63825
 
0.3%
06fb9ba8 83
 
< 0.1%
26cf31be 10
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 55775620
37.2%
a 18659392
 
12.5%
b 18590244
 
12.4%
4 18590068
 
12.4%
1 18526253
 
12.4%
7 18526243
 
12.4%
9 330040
 
0.2%
0 266215
 
0.2%
e 196901
 
0.1%
c 133076
 
0.1%
Other values (6) 191764
 
0.1%
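The character table above can be reproduced from the value table for this text variable: each 8-character value contributes its characters once per row in which it occurs. A small sketch of that bookkeeping, with the value counts copied from the report:

```python
from collections import Counter

# Value counts copied from the value table of this text variable.
value_counts = {
    "a55475b1": 18_526_243,
    "9a0c095e": 133_066,
    "8fd95e4b": 63_825,
    "06fb9ba8": 83,
    "26cf31be": 10,
}

# Weight every character of a value by the number of rows holding that value.
char_counts = Counter()
for value, rows in value_counts.items():
    for ch in value:
        char_counts[ch] += rows

total_chars = sum(char_counts.values())
print(char_counts.most_common(6))
print("total characters:", total_chars, "distinct:", len(char_counts))
```

The dominance of the character 5 is therefore just a side effect of a55475b1, which contains three 5s and fills 98.9% of the rows.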

Most occurring categories

ValueCountFrequency (%)
Decimal Number 112078460
74.8%
Lowercase Letter 37707356
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 55775620
49.8%
4 18590068
 
16.6%
1 18526253
 
16.5%
7 18526243
 
16.5%
9 330040
 
0.3%
0 266215
 
0.2%
8 63908
 
0.1%
6 93
 
< 0.1%
2 10
 
< 0.1%
3 10
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 18659392
49.5%
b 18590244
49.3%
e 196901
 
0.5%
c 133076
 
0.4%
f 63918
 
0.2%
d 63825
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 112078460
74.8%
Latin 37707356
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 55775620
49.8%
4 18590068
 
16.6%
1 18526253
 
16.5%
7 18526243
 
16.5%
9 330040
 
0.3%
0 266215
 
0.2%
8 63908
 
0.1%
6 93
 
< 0.1%
2 10
 
< 0.1%
3 10
 
< 0.1%
Latin
ValueCountFrequency (%)
a 18659392
49.5%
b 18590244
49.3%
e 196901
 
0.5%
c 133076
 
0.4%
f 63918
 
0.2%
d 63825
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149785816
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 55775620
37.2%
a 18659392
 
12.5%
b 18590244
 
12.4%
4 18590068
 
12.4%
1 18526253
 
12.4%
7 18526243
 
12.4%
9 330040
 
0.2%
0 266215
 
0.2%
e 196901
 
0.1%
c 133076
 
0.1%
Other values (6) 191764
 
0.1%
Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size142.8 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters: 149785816
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row9a0c095e
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 17972743
96.0%
9a0c095e 404065
 
2.2%
8fd95e4b 345159
 
1.8%
06fb9ba8 1107
 
< 0.1%
3cbe86ba 146
 
< 0.1%
c7a5ad39 6
 
< 0.1%
9276e4bb 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 54667459
36.5%
a 18378073
 
12.3%
b 18320410
 
12.2%
4 18317903
 
12.2%
7 17972750
 
12.0%
1 17972743
 
12.0%
9 1154403
 
0.8%
0 809237
 
0.5%
e 749371
 
0.5%
c 404217
 
0.3%
Other values (6) 1039250
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 111242314
74.3%
Lowercase Letter 38543502
 
25.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 54667459
49.1%
4 18317903
 
16.5%
7 17972750
 
16.2%
1 17972743
 
16.2%
9 1154403
 
1.0%
0 809237
 
0.7%
8 346412
 
0.3%
6 1254
 
< 0.1%
3 152
 
< 0.1%
2 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 18378073
47.7%
b 18320410
47.5%
e 749371
 
1.9%
c 404217
 
1.0%
f 346266
 
0.9%
d 345165
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Common 111242314
74.3%
Latin 38543502
 
25.7%

Most frequent character per script

Common
ValueCountFrequency (%)
5 54667459
49.1%
4 18317903
 
16.5%
7 17972750
 
16.2%
1 17972743
 
16.2%
9 1154403
 
1.0%
0 809237
 
0.7%
8 346412
 
0.3%
6 1254
 
< 0.1%
3 152
 
< 0.1%
2 1
 
< 0.1%
Latin
ValueCountFrequency (%)
a 18378073
47.7%
b 18320410
47.5%
e 749371
 
1.9%
c 404217
 
1.0%
f 346266
 
0.9%
d 345165
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149785816
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 54667459
36.5%
a 18378073
 
12.3%
b 18320410
 
12.2%
4 18317903
 
12.2%
7 17972750
 
12.0%
1 17972743
 
12.0%
9 1154403
 
0.8%
0 809237
 
0.5%
e 749371
 
0.5%
c 404217
 
0.3%
Other values (6) 1039250
 
0.7%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING 

Distinct8501
Distinct (%)4.3%
Missing18526243
Missing (%)98.9%
Infinite0
Infinite (%)0.0%
Mean9448631.968
Minimum0
Maximum6027219225
Zeros181556
Zeros (%)1.0%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile5373873.4
Maximum6027219225
Range6027219225
Interquartile range (IQR)0

Descriptive statistics

Standard deviation: 114320250.1
Coefficient of variation (CV): 12.0991325
Kurtosis: 468.522471
Mean: 9448631.968
Median Absolute Deviation (MAD): 0
Skewness: 18.81802309
Sum: 1.86122932 × 10^12
Variance: 1.306911958 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 181556
 
1.0%
1450000000 530
 
< 0.1%
750000000 450
 
< 0.1%
60000000 174
 
< 0.1%
57484513 99
 
< 0.1%
248173000 99
 
< 0.1%
53007067 99
 
< 0.1%
54115247 99
 
< 0.1%
32469000 99
 
< 0.1%
40051622 99
 
< 0.1%
Other values (8491) 13680
 
0.1%
(Missing) 18526243
98.9%
ValueCountFrequency (%)
0 181556
1.0%
1 47
 
< 0.1%
100 1
 
< 0.1%
400 1
 
< 0.1%
450 1
 
< 0.1%
ValueCountFrequency (%)
6027219225 3
 
< 0.1%
4832979108 1
 
< 0.1%
3200000000 2
 
< 0.1%
3063721973 99
< 0.1%
2083503000 1
 
< 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct21973
Distinct (%)2.9%
Missing17972743
Missing (%)96.0%
Infinite0
Infinite (%)0.0%
Mean1728788.501
Minimum0
Maximum5500000000
Zeros684497
Zeros (%)3.7%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile120000
Maximum5500000000
Range5500000000
Interquartile range (IQR)0

Descriptive statistics

Standard deviation: 62652574.11
Coefficient of variation (CV): 36.24073972
Kurtosis: 3451.470303
Mean: 1728788.501
Median Absolute Deviation (MAD): 0
Skewness: 54.73401012
Sum: 1.29742811 × 10^12
Variance: 3.925345043 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 684497
 
3.7%
60000 1994
 
< 0.1%
130000 1474
 
< 0.1%
100000 1454
 
< 0.1%
50000 1039
 
< 0.1%
65000 785
 
< 0.1%
200000 718
 
< 0.1%
300000 685
 
< 0.1%
150000 654
 
< 0.1%
70000 639
 
< 0.1%
Other values (21963) 56545
 
0.3%
(Missing) 17972743
96.0%
ValueCountFrequency (%)
0 684497
3.7%
0.02 9
 
< 0.1%
0.03 23
 
< 0.1%
0.04 5
 
< 0.1%
0.06 4
 
< 0.1%
ValueCountFrequency (%)
5500000000 9
 
< 0.1%
4500000000 49
< 0.1%
4275168566 1
 
< 0.1%
4275158566 2
 
< 0.1%
3500000000 4
 
< 0.1%
Distinct15
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size142.8 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters: 149785816
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowc7a5ad39
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 17973072
96.0%
c7a5ad39 594340
 
3.2%
3cbe86ba 105770
 
0.6%
9276e4bb 15142
 
0.1%
0e63c0f0 13546
 
0.1%
168ad9f3 4713
 
< 0.1%
5224034a 3726
 
< 0.1%
940efad7 3263
 
< 0.1%
7b62420e 3208
 
< 0.1%
2fd21cf1 2329
 
< 0.1%
Other values (5) 4118
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 54520000
36.4%
a 19282896
 
12.9%
7 18591541
 
12.4%
b 18219859
 
12.2%
4 18006803
 
12.0%
1 17983528
 
12.0%
3 723728
 
0.5%
c 718703
 
0.5%
9 620724
 
0.4%
d 605375
 
0.4%
Other values (6) 512659
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 110790028
74.0%
Lowercase Letter 38995788
 
26.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 54520000
49.2%
7 18591541
 
16.8%
4 18006803
 
16.3%
1 17983528
 
16.2%
3 723728
 
0.7%
9 620724
 
0.6%
6 144134
 
0.1%
8 112298
 
0.1%
0 52874
 
< 0.1%
2 34398
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 19282896
49.4%
b 18219859
46.7%
c 718703
 
1.8%
d 605375
 
1.6%
e 142014
 
0.4%
f 26941
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 110790028
74.0%
Latin 38995788
 
26.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 54520000
49.2%
7 18591541
 
16.8%
4 18006803
 
16.3%
1 17983528
 
16.2%
3 723728
 
0.7%
9 620724
 
0.6%
6 144134
 
0.1%
8 112298
 
0.1%
0 52874
 
< 0.1%
2 34398
 
< 0.1%
Latin
ValueCountFrequency (%)
a 19282896
49.4%
b 18219859
46.7%
c 718703
 
1.8%
d 605375
 
1.6%
e 142014
 
0.4%
f 26941
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149785816
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 54520000
36.4%
a 19282896
 
12.9%
7 18591541
 
12.4%
b 18219859
 
12.2%
4 18006803
 
12.0%
1 17983528
 
12.0%
3 723728
 
0.5%
c 718703
 
0.5%
9 620724
 
0.4%
d 605375
 
0.4%
Other values (6) 512659
 
0.3%
Distinct14
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size142.8 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters: 149785816
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st rowc7a5ad39
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 18526316
98.9%
c7a5ad39 177182
 
0.9%
9276e4bb 8019
 
< 0.1%
0e63c0f0 5246
 
< 0.1%
168ad9f3 1853
 
< 0.1%
7b62420e 1728
 
< 0.1%
f4d8a027 1508
 
< 0.1%
3cbe86ba 442
 
< 0.1%
940efad7 426
 
< 0.1%
2fd21cf1 230
 
< 0.1%
Other values (4) 277
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 55756406
37.2%
a 18885084
 
12.6%
7 18715283
 
12.5%
b 18545070
 
12.4%
4 18538344
 
12.4%
1 18528732
 
12.4%
9 187606
 
0.1%
3 184896
 
0.1%
c 183266
 
0.1%
d 181199
 
0.1%
Other values (6) 79930
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 111965740
74.8%
Lowercase Letter 37820076
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 55756406
49.8%
7 18715283
 
16.7%
4 18538344
 
16.6%
1 18528732
 
16.5%
9 187606
 
0.2%
3 184896
 
0.2%
0 19512
 
< 0.1%
6 17392
 
< 0.1%
2 13663
 
< 0.1%
8 3906
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 18885084
49.9%
b 18545070
49.0%
c 183266
 
0.5%
d 181199
 
0.5%
e 15964
 
< 0.1%
f 9493
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 111965740
74.8%
Latin 37820076
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 55756406
49.8%
7 18715283
 
16.7%
4 18538344
 
16.6%
1 18528732
 
16.5%
9 187606
 
0.2%
3 184896
 
0.2%
0 19512
 
< 0.1%
6 17392
 
< 0.1%
2 13663
 
< 0.1%
8 3906
 
< 0.1%
Latin
ValueCountFrequency (%)
a 18885084
49.9%
b 18545070
49.0%
c 183266
 
0.5%
d 181199
 
0.5%
e 15964
 
< 0.1%
f 9493
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149785816
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 55756406
37.2%
a 18885084
 
12.6%
7 18715283
 
12.5%
b 18545070
 
12.4%
4 18538344
 
12.4%
1 18528732
 
12.4%
9 187606
 
0.1%
3 184896
 
0.1%
c 183266
 
0.1%
d 181199
 
0.1%
Other values (6) 79930
 
0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct153
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.310656384
Minimum0
Maximum152
Zeros3284738
Zeros (%)17.5%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median3
Q37
95-th percentile17
Maximum152
Range152
Interquartile range (IQR)6

Descriptive statistics

Standard deviation6.70219662
Coefficient of variation (CV)1.262027918
Kurtosis31.59048505
Mean5.310656384
Median Absolute Deviation (MAD)3
Skewness3.862506627
Sum99432625
Variance44.91943954
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3284738
17.5%
1 2557151
13.7%
2 2060067
11.0%
3 1713951
9.2%
4 1447605
7.7%
5 1228497
 
6.6%
6 1038601
 
5.5%
7 874735
 
4.7%
8 735992
 
3.9%
9 612795
 
3.3%
Other values (143) 3169095
16.9%
ValueCountFrequency (%)
0 3284738
17.5%
1 2557151
13.7%
2 2060067
11.0%
3 1713951
9.2%
4 1447605
7.7%
ValueCountFrequency (%)
152 12
 
< 0.1%
151 12
 
< 0.1%
150 12
 
< 0.1%
149 12
 
< 0.1%
148 36
< 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct36
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean13.49925245
Minimum0
Maximum35
Zeros756865
Zeros (%)4.0%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum0
5-th percentile1
Q16
median12
Q320
95-th percentile32
Maximum35
Range35
Interquartile range (IQR)14

Descriptive statistics

Standard deviation9.359253382
Coefficient of variation (CV)0.6933164199
Kurtosis-0.6953470677
Mean13.49925245
Median Absolute Deviation (MAD)7
Skewness0.4833000703
Sum252749568
Variance87.59562387
MonotonicityNot monotonic
Histogram with fixed size bins (bins=36)
ValueCountFrequency (%)
0 756865
 
4.0%
6 756800
 
4.0%
11 756800
 
4.0%
10 756800
 
4.0%
8 756800
 
4.0%
7 756800
 
4.0%
9 756800
 
4.0%
5 756800
 
4.0%
4 756800
 
4.0%
3 756800
 
4.0%
Other values (26) 11155162
59.6%
ValueCountFrequency (%)
0 756865
4.0%
1 756800
4.0%
2 756800
4.0%
3 756800
4.0%
4 756800
4.0%
ValueCountFrequency (%)
35 236606
1.3%
34 236606
1.3%
33 236606
1.3%
32 236606
1.3%
31 236606
1.3%

pmts_dpd_1073P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct2782
Distinct (%)0.1%
Missing15922857
Missing (%)85.0%
Infinite0
Infinite (%)0.0%
Mean4.420827248
Minimum0
Maximum4212
Zeros2670658
Zeros (%)14.3%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum4212
Range4212
Interquartile range (IQR)0

Descriptive statistics

Standard deviation80.87349048
Coefficient of variation (CV)18.29374593
Kurtosis959.9449549
Mean4.420827248
Median Absolute Deviation (MAD)0
Skewness28.43925387
Sum12379952
Variance6540.521462
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2670658
 
14.3%
1 24013
 
0.1%
2 10062
 
0.1%
3 8241
 
< 0.1%
4 7341
 
< 0.1%
7 4717
 
< 0.1%
5 4617
 
< 0.1%
6 4166
 
< 0.1%
8 4001
 
< 0.1%
10 3528
 
< 0.1%
Other values (2772) 59026
 
0.3%
(Missing) 15922857
85.0%
ValueCountFrequency (%)
0 2670658
14.3%
1 24013
 
0.1%
2 10062
 
0.1%
3 8241
 
< 0.1%
4 7341
 
< 0.1%
ValueCountFrequency (%)
4212 1
< 0.1%
4187 1
< 0.1%
4162 1
< 0.1%
4137 1
< 0.1%
4104 1
< 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct4217
Distinct (%)0.1%
Missing10412923
Missing (%)55.6%
Infinite0
Infinite (%)0.0%
Mean56.20145629
Minimum-30
Maximum84575
Zeros7004364
Zeros (%)37.4%
Negative1057
Negative (%)< 0.1%
Memory size142.8 MiB

Quantile statistics

Minimum-30
5-th percentile0
Q10
median0
Q30
95-th percentile337
Maximum84575
Range84605
Interquartile range (IQR)0

Descriptive statistics

Standard deviation282.9937068
Coefficient of variation (CV)5.035344731
Kurtosis8649.934629
Mean56.20145629
Median Absolute Deviation (MAD)0
Skewness34.75300003
Sum467051187
Variance80085.4381
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 7004364
37.4%
1 188161
 
1.0%
3 47792
 
0.3%
2 45809
 
0.2%
4 39322
 
0.2%
6 31533
 
0.2%
5 26247
 
0.1%
7 25690
 
0.1%
9 19059
 
0.1%
8 18694
 
0.1%
Other values (4207) 863633
 
4.6%
(Missing) 10412923
55.6%
ValueCountFrequency (%)
-30 1
 
< 0.1%
-15 1
 
< 0.1%
-10 4
< 0.1%
-9 1
 
< 0.1%
-8 2
< 0.1%
ValueCountFrequency (%)
84575 1
< 0.1%
84574 1
< 0.1%
84561 1
< 0.1%
84560 1
< 0.1%
84533 1
< 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct12
Distinct (%)< 0.1%
Missing13871207
Missing (%)74.1%
Infinite0
Infinite (%)0.0%
Mean6.5
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum1
5-th percentile1
Q13.75
median6.5
Q39.25
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5.5

Descriptive statistics

Standard deviation3.452052885
Coefficient of variation (CV)0.5310850593
Kurtosis-1.216783234
Mean6.5
Median Absolute Deviation (MAD)3
Skewness0
Sum31538130
Variance11.91666912
MonotonicityNot monotonic
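The pmts_month_158T statistics are exactly what a uniform distribution over the months 1 to 12 would produce, which agrees with the value table (404335 rows for every month). The sketch below derives them with exact fractions. The tiny difference between the exact excess kurtosis and the reported value is presumably due to a small-sample bias correction in the profiler; that explanation is an assumption.

```python
from fractions import Fraction

months = range(1, 13)
rows_per_month = 404_335            # copied from the value table
n = 12 * rows_per_month             # non-missing rows
n_obs = 18_723_227                  # total observations in the dataset

mean = Fraction(sum(months), 12)
pop_var = sum((m - mean) ** 2 for m in months) / 12      # population variance
sample_var = pop_var * n / (n - 1)                       # n - 1 denominator
fourth_moment = sum((m - mean) ** 4 for m in months) / 12
excess_kurtosis = fourth_moment / pop_var ** 2 - 3

print("mean:", float(mean))
print("sample variance:", float(sample_var))
print("excess kurtosis:", float(excess_kurtosis))
print("sum:", mean * n, "missing:", n_obs - n)
```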
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 404335
 
2.2%
3 404335
 
2.2%
4 404335
 
2.2%
5 404335
 
2.2%
6 404335
 
2.2%
7 404335
 
2.2%
8 404335
 
2.2%
9 404335
 
2.2%
10 404335
 
2.2%
11 404335
 
2.2%
Other values (2) 808670
 
4.3%
(Missing) 13871207
74.1%
ValueCountFrequency (%)
1 404335
2.2%
2 404335
2.2%
3 404335
2.2%
4 404335
2.2%
5 404335
2.2%
ValueCountFrequency (%)
12 404335
2.2%
11 404335
2.2%
10 404335
2.2%
9 404335
2.2%
8 404335
2.2%

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct12
Distinct (%)< 0.1%
Missing1255931
Missing (%)6.7%
Infinite0
Infinite (%)0.0%
Mean6.5
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum1
5-th percentile1
Q13.75
median6.5
Q39.25
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5.5

Descriptive statistics

Standard deviation3.452052628
Coefficient of variation (CV)0.5310850197
Kurtosis-1.216783222
Mean6.5
Median Absolute Deviation (MAD)3
Skewness0
Sum113537424
Variance11.91666735
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 1455608
7.8%
3 1455608
7.8%
4 1455608
7.8%
5 1455608
7.8%
6 1455608
7.8%
7 1455608
7.8%
8 1455608
7.8%
9 1455608
7.8%
10 1455608
7.8%
11 1455608
7.8%
Other values (2) 2911216
15.5%
ValueCountFrequency (%)
1 1455608
7.8%
2 1455608
7.8%
3 1455608
7.8%
4 1455608
7.8%
5 1455608
7.8%
ValueCountFrequency (%)
12 1455608
7.8%
11 1455608
7.8%
10 1455608
7.8%
9 1455608
7.8%
8 1455608
7.8%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct101237
Distinct (%)3.6%
Missing15907992
Missing (%)85.0%
Infinite0
Infinite (%)0.0%
Mean4212.204303
Minimum0
Maximum437339700
Zeros2682948
Zeros (%)14.3%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum437339700
Range437339700
Interquartile range (IQR)0

Descriptive statistics

Standard deviation: 1030260.598
Coefficient of variation (CV): 244.5894177
Kurtosis: 158071.8089
Mean: 4212.204303
Median Absolute Deviation (MAD): 0
Skewness: 389.9098383
Sum: 1.185834498 × 10^10
Variance: 1.061436899 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2682948
 
14.3%
10 255
 
< 0.1%
99.8 218
 
< 0.1%
14 217
 
< 0.1%
400 164
 
< 0.1%
0.2 158
 
< 0.1%
4 119
 
< 0.1%
20 116
 
< 0.1%
2 114
 
< 0.1%
8 104
 
< 0.1%
Other values (101227) 130822
 
0.7%
(Missing) 15907992
85.0%
ValueCountFrequency (%)
0 2682948
14.3%
0.002 12
 
< 0.1%
0.004 14
 
< 0.1%
0.006 8
 
< 0.1%
0.008 31
 
< 0.1%
ValueCountFrequency (%)
437339700 2
< 0.1%
432173570 2
< 0.1%
427007420 4
< 0.1%
416675140 2
< 0.1%
411508960 2
< 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct573705
Distinct (%)6.9%
Missing10402366
Missing (%)55.6%
Infinite0
Infinite (%)0.0%
Mean3737.791521
Minimum0
Maximum120690770
Zeros6962787
Zeros (%)37.2%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile14856.078
Maximum120690770
Range120690770
Interquartile range (IQR)0

Descriptive statistics

Standard deviation: 71009.10589
Coefficient of variation (CV): 18.99761008
Kurtosis: 1032879.158
Mean: 3737.791521
Median Absolute Deviation (MAD): 0
Skewness: 700.641166
Sum: 3.110164369 × 10^10
Variance: 5042293119
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 6962787
37.2%
0.2 3820
 
< 0.1%
1000 2029
 
< 0.1%
0.4 1778
 
< 0.1%
0.8 1408
 
< 0.1%
2000 1332
 
< 0.1%
1.6 1296
 
< 0.1%
0.6 1244
 
< 0.1%
4 1242
 
< 0.1%
2 1190
 
< 0.1%
Other values (573695) 1342735
 
7.2%
(Missing) 10402366
55.6%
ValueCountFrequency (%)
0 6962787
37.2%
0.002 135
 
< 0.1%
0.004 75
 
< 0.1%
0.006 58
 
< 0.1%
0.008 28
 
< 0.1%
ValueCountFrequency (%)
120690770 1
< 0.1%
28935590 1
< 0.1%
27888398 2
< 0.1%
23032604 1
< 0.1%
21444070 2
< 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct6
Distinct (%)< 0.1%
Missing13871207
Missing (%)74.1%
Infinite0
Infinite (%)0.0%
Mean2019.377584
Minimum2016
Maximum2021
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum2016
5-th percentile2018
Q12019
median2019
Q32020
95-th percentile2020
Maximum2021
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.7969793792
Coefficient of variation (CV)0.0003946658542
Kurtosis-0.6982152095
Mean2019.377584
Median Absolute Deviation (MAD)1
Skewness-0.3187228351
Sum9798060427
Variance0.6351761309
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
2020 2232716
 
11.9%
2019 1649132
 
8.8%
2018 779995
 
4.2%
2021 189931
 
1.0%
2017 202
 
< 0.1%
2016 44
 
< 0.1%
(Missing) 13871207
74.1%
ValueCountFrequency (%)
2016 44
 
< 0.1%
2017 202
 
< 0.1%
2018 779995
 
4.2%
2019 1649132
8.8%
2020 2232716
11.9%
ValueCountFrequency (%)
2021 189931
 
1.0%
2020 2232716
11.9%
2019 1649132
8.8%
2018 779995
 
4.2%
2017 202
 
< 0.1%
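Because pmts_year_1139T has only six distinct values, its count, sum and mean can be recomputed exactly from the value table. A short sketch with the counts copied from the table above:

```python
# Year -> row count, copied from the value table.
year_counts = {
    2016: 44,
    2017: 202,
    2018: 779_995,
    2019: 1_649_132,
    2020: 2_232_716,
    2021: 189_931,
}
n_obs = 18_723_227  # total observations in the dataset

n_present = sum(year_counts.values())
year_sum = sum(year * rows for year, rows in year_counts.items())
year_mean = year_sum / n_present
n_missing = n_obs - n_present

print("non-missing rows:", n_present, "missing:", n_missing)
print("sum:", year_sum)
print(f"mean: {year_mean:.6f}")
```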

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct25
Distinct (%)< 0.1%
Missing1255931
Missing (%)6.7%
Infinite0
Infinite (%)0.0%
Mean2015.012677
Minimum2001
Maximum2028
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size142.8 MiB

Quantile statistics

Minimum2001
5-th percentile2007
Q12012
median2016
Q32018
95-th percentile2020
Maximum2028
Range27
Interquartile range (IQR)6

Descriptive statistics

Standard deviation: 4.045820772
Coefficient of variation (CV): 0.002007838868
Kurtosis: -0.4456587926
Mean: 2015.012677
Median Absolute Deviation (MAD): 3
Skewness: -0.7661366954
Sum: 3.519682287 × 10^10
Variance: 16.36866572
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
ValueCountFrequency (%)
2019 2522404
13.5%
2018 2504744
13.4%
2017 1894527
10.1%
2016 1399783
7.5%
2015 1244878
6.6%
2020 1230381
6.6%
2014 1146048
 
6.1%
2013 987335
 
5.3%
2012 806461
 
4.3%
2011 696986
 
3.7%
Other values (15) 3033749
16.2%
(Missing) 1255931
6.7%
ValueCountFrequency (%)
2001 33
 
< 0.1%
2002 333
 
< 0.1%
2003 1020
 
< 0.1%
2004 33189
 
0.2%
2005 166678
0.9%
ValueCountFrequency (%)
2028 1
 
< 0.1%
2027 12
 
< 0.1%
2026 12
 
< 0.1%
2025 11
 
< 0.1%
2021 92767
0.5%
Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size142.8 MiB

Length

Max length9
Median length8
Mean length8.000000053
Min length8
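The mean length sits just above 8 because almost every value has 8 characters. The sketch below assumes that exactly one row holds a 9-character value, which is what the value table for this variable suggests, and recomputes the total character count and the mean length.

```python
n_rows = 18_723_227        # observations; this variable has no missing values
rows_with_9_chars = 1      # assumption: a single 9-character value

total_chars = 8 * (n_rows - rows_with_9_chars) + 9 * rows_with_9_chars
mean_length = total_chars / n_rows

print("total characters:", total_chars)
print(f"mean length: {mean_length:.9f}")
```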

Characters and Unicode

Total characters: 149785817
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st rowab3c25cf
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 17977415
96.0%
ab3c25cf 733318
 
3.9%
15f04f45 6501
 
< 0.1%
be4fd70b 3907
 
< 0.1%
daf49a8a 2058
 
< 0.1%
71ddaa88 12
 
< 0.1%
0c42a10e 9
 
< 0.1%
1d94eac1 6
 
< 0.1%
p28_48_88 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 54678565
36.5%
b 18718547
 
12.5%
a 18716946
 
12.5%
4 17996398
 
12.0%
1 17983949
 
12.0%
7 17981334
 
12.0%
c 1466651
 
1.0%
f 752285
 
0.5%
2 733328
 
0.5%
3 733318
 
0.5%
Other values (7) 24496
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 110121468
73.5%
Lowercase Letter 39664346
 
26.5%
Connector Punctuation 2
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 54678565
49.7%
4 17996398
 
16.3%
1 17983949
 
16.3%
7 17981334
 
16.3%
2 733328
 
0.7%
3 733318
 
0.7%
0 10426
 
< 0.1%
8 2086
 
< 0.1%
9 2064
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 18718547
47.2%
a 18716946
47.2%
c 1466651
 
3.7%
f 752285
 
1.9%
d 5995
 
< 0.1%
e 3922
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 110121470
73.5%
Latin 39664347
 
26.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 54678565
49.7%
4 17996398
 
16.3%
1 17983949
 
16.3%
7 17981334
 
16.3%
2 733328
 
0.7%
3 733318
 
0.7%
0 10426
 
< 0.1%
8 2086
 
< 0.1%
9 2064
 
< 0.1%
_ 2
 
< 0.1%
Latin
ValueCountFrequency (%)
b 18718547
47.2%
a 18716946
47.2%
c 1466651
 
3.7%
f 752285
 
1.9%
d 5995
 
< 0.1%
e 3922
 
< 0.1%
P 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149785817
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 54678565
36.5%
b 18718547
 
12.5%
a 18716946
 
12.5%
4 17996398
 
12.0%
1 17983949
 
12.0%
7 17981334
 
12.0%
c 1466651
 
1.0%
f 752285
 
0.5%
2 733328
 
0.5%
3 733318
 
0.5%
Other values (7) 24496
 
< 0.1%
Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size142.8 MiB

Length

Max length9
Median length8
Mean length8.000015489
Min length8

Characters and Unicode

Total characters: 149786106
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st rowab3c25cf
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 18532180
99.0%
ab3c25cf 186663
 
1.0%
be4fd70b 1869
 
< 0.1%
15f04f45 1136
 
< 0.1%
daf49a8a 1088
 
< 0.1%
p28_48_88 290
 
< 0.1%
71ddaa88 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 55785475
37.2%
b 18722581
 
12.5%
a 18722109
 
12.5%
4 18537699
 
12.4%
7 18534050
 
12.4%
1 18533317
 
12.4%
c 373326
 
0.2%
f 191892
 
0.1%
2 186953
 
0.1%
3 186663
 
0.1%
Other values (7) 12041
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 111770500
74.6%
Lowercase Letter 38014736
 
25.4%
Connector Punctuation 580
 
< 0.1%
Uppercase Letter 290
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 55785475
49.9%
4 18537699
 
16.6%
7 18534050
 
16.6%
1 18533317
 
16.6%
2 186953
 
0.2%
3 186663
 
0.2%
0 3005
 
< 0.1%
8 2250
 
< 0.1%
9 1088
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 18722581
49.3%
a 18722109
49.2%
c 373326
 
1.0%
f 191892
 
0.5%
d 2959
 
< 0.1%
e 1869
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 580
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 290
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 111771080
74.6%
Latin 38015026
 
25.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 55785475
49.9%
4 18537699
 
16.6%
7 18534050
 
16.6%
1 18533317
 
16.6%
2 186953
 
0.2%
3 186663
 
0.2%
0 3005
 
< 0.1%
8 2250
 
< 0.1%
9 1088
 
< 0.1%
_ 580
 
< 0.1%
Latin
ValueCountFrequency (%)
b 18722581
49.3%
a 18722109
49.2%
c 373326
 
1.0%
f 191892
 
0.5%
d 2959
 
< 0.1%
e 1869
 
< 0.1%
P 290
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149786106
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 55785475
37.2%
b 18722581
 
12.5%
a 18722109
 
12.5%
4 18537699
 
12.4%
7 18534050
 
12.4%
1 18533317
 
12.4%
c 373326
 
0.2%
f 191892
 
0.1%
2 186953
 
0.1%
3 186663
 
0.1%
Other values (7) 12041
 
< 0.1%