Overview

Dataset statistics

Number of variables: 19
Number of observations: 8055986
Missing cells: 51201435
Missing cells (%): 33.5%
Total size in memory: 1.1 GiB
Average record size in memory: 152.0 B
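
The derived figures in this block can be recomputed from the raw counts. The following is a minimal sketch, not the profiler's own code. It assumes every cell is stored in 8 bytes, which the report does not state but which reproduces the reported 152.0 B per record for 19 columns. The function names are illustrative only.

```python
# Reported values copied from the "Dataset statistics" block.
N_VARIABLES = 19
N_OBSERVATIONS = 8_055_986
MISSING_CELLS = 51_201_435

# Assumption (not stated in the report): each cell occupies 8 bytes.
BYTES_PER_CELL = 8


def missing_cell_share(missing: int, rows: int, cols: int) -> float:
    """Fraction of all cells (rows x cols) that are missing."""
    return missing / (rows * cols)


def record_size_bytes(cols: int, bytes_per_cell: int = BYTES_PER_CELL) -> int:
    """Size of one row under the fixed-width assumption."""
    return cols * bytes_per_cell


share = missing_cell_share(MISSING_CELLS, N_OBSERVATIONS, N_VARIABLES)
record = record_size_bytes(N_VARIABLES)
total_gib = record * N_OBSERVATIONS / 2**30

print(f"Missing cells: {share:.1%}")
print(f"Record size: {record} B")
print(f"Total size: {total_gib:.1f} GiB")
```

Under the same assumption a single column takes 8055986 × 8 bytes, which is about 61.5 MiB and matches the per-variable "Memory size" entries.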

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 7956618 (98.8%) missing values [Missing]
collater_valueofguarantee_876L has 7720433 (95.8%) missing values [Missing]
pmts_dpd_1073P has 6871133 (85.3%) missing values [Missing]
pmts_dpd_303P has 4608761 (57.2%) missing values [Missing]
pmts_month_158T has 5585750 (69.3%) missing values [Missing]
pmts_month_706T has 702590 (8.7%) missing values [Missing]
pmts_overdue_1140A has 6863527 (85.2%) missing values [Missing]
pmts_overdue_1152A has 4604283 (57.2%) missing values [Missing]
pmts_year_1139T has 5585750 (69.3%) missing values [Missing]
pmts_year_507T has 702590 (8.7%) missing values [Missing]
collater_valueofguarantee_876L is highly skewed (γ1 = 46.93143835) [Skewed]
pmts_dpd_303P is highly skewed (γ1 = 36.80773901) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 344.9531162) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 259.3446397) [Skewed]
collater_valueofguarantee_1124L has 90662 (1.1%) zeros [Zeros]
collater_valueofguarantee_876L has 293947 (3.6%) zeros [Zeros]
num_group1 has 1462147 (18.1%) zeros [Zeros]
num_group2 has 325360 (4.0%) zeros [Zeros]
pmts_dpd_1073P has 1115719 (13.8%) zeros [Zeros]
pmts_dpd_303P has 2884014 (35.8%) zeros [Zeros]
pmts_overdue_1140A has 1122141 (13.9%) zeros [Zeros]
pmts_overdue_1152A has 2864665 (35.6%) zeros [Zeros]
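
The skewness alerts can be cross-checked against the per-variable skewness values reported in the Variables section. The sketch below assumes a cutoff of |γ1| > 20. That cutoff is an assumption chosen because it reproduces exactly the four skewness alerts listed here; the report itself does not state the threshold. The helper name is hypothetical.

```python
# Skewness per numeric variable, copied from the Variables section.
SKEWNESS = {
    "case_id": -0.587070045,
    "collater_valueofguarantee_1124L": 12.42348167,
    "collater_valueofguarantee_876L": 46.93143835,
    "num_group1": 10.80047156,
    "num_group2": 0.6049061605,
    "pmts_dpd_1073P": 17.57120307,
    "pmts_dpd_303P": 36.80773901,
    "pmts_month_158T": 0.0,
    "pmts_month_706T": 0.0,
    "pmts_overdue_1140A": 344.9531162,
    "pmts_overdue_1152A": 259.3446397,
    "pmts_year_1139T": -0.2526182713,
    "pmts_year_507T": -0.763033686,
}

# Assumed cutoff, not stated in the report.
SKEW_THRESHOLD = 20.0


def skewed_variables(skewness: dict[str, float],
                     threshold: float = SKEW_THRESHOLD) -> list[str]:
    """Return the variables whose absolute skewness exceeds the cutoff."""
    return sorted(name for name, g in skewness.items() if abs(g) > threshold)


flagged = skewed_variables(SKEWNESS)
print(flagged)
```

Variables such as pmts_dpd_1073P (γ1 ≈ 17.6) and collater_valueofguarantee_1124L (γ1 ≈ 12.4) are strongly right-skewed as well but fall below the assumed cutoff, which is consistent with their absence from the alert list.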

Reproduction

Analysis started: 2024-02-13 19:51:08.860476
Analysis finished: 2024-02-13 19:51:28.115434
Duration: 19.25 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
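
The reported duration is the difference between the two timestamps. A small sketch using only the standard library, with the timestamp format inferred from the values shown:

```python
from datetime import datetime

# Format inferred from the timestamps in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-02-13 19:51:08.860476", FMT)
finished = datetime.strptime("2024-02-13 19:51:28.115434", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```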

Variables

case_id
Real number (ℝ)

Distinct: 45056
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1480036.192
Minimum: 49417
Maximum: 2681255
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 49417
5-th percentile: 218775
Q1: 977210
median: 1825914
Q3: 1836249
95-th percentile: 2678919
Maximum: 2681255
Range: 2631838
Interquartile range (IQR): 859039

Descriptive statistics

Standard deviation: 747088.2698
Coefficient of variation (CV): 0.5047770275
Kurtosis: -0.6460760485
Mean: 1480036.192
Median Absolute Deviation (MAD): 13367
Skewness: -0.587070045
Sum: 1.192315085 × 10^13
Variance: 5.581408829 × 10^11
Monotonicity: Increasing
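
Several of the case_id statistics are functions of the others, which gives a quick consistency check. The sketch below recomputes range, IQR, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. It uses the standard textbook definitions, which are assumed here to match what the profiler computes.

```python
# Reported values for case_id, copied from the report.
minimum, maximum = 49_417, 2_681_255
q1, q3 = 977_210, 1_836_249
mean = 1_480_036.192
std = 747_088.2698

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range, iqr)
print(f"{cv:.6f} {variance:.6e}")
```

The recomputed IQR of 859039 also confirms that the quartiles read as Q1 = 977210 and Q3 = 1836249.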
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
221467 | 7859 | 0.1%
973076 | 6764 | 0.1%
220977 | 4269 | 0.1%
1825173 | 2268 | < 0.1%
219624 | 1800 | < 0.1%
1835890 | 1776 | < 0.1%
1836084 | 1776 | < 0.1%
1831201 | 1656 | < 0.1%
982012 | 1632 | < 0.1%
1842598 | 1572 | < 0.1%
Other values (45046) | 8024614 | 99.6%

Value | Count | Frequency (%)
49417 | 444 | < 0.1%
49444 | 240 | < 0.1%
49450 | 108 | < 0.1%
49538 | 156 | < 0.1%
49632 | 60 | < 0.1%

Value | Count | Frequency (%)
2681255 | 492 | < 0.1%
2681252 | 84 | < 0.1%
2681251 | 72 | < 0.1%
2681250 | 132 | < 0.1%
2681249 | 108 | < 0.1%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 64447888
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9a0c095e
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7956618 | 98.8%
9a0c095e | 61356 | 0.8%
8fd95e4b | 37968 | 0.5%
06fb9ba8 | 44 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23969178
37.2%
a 8018018
 
12.4%
b 7994674
 
12.4%
4 7994586
 
12.4%
7 7956618
 
12.3%
1 7956618
 
12.3%
9 160724
 
0.2%
0 122756
 
0.2%
e 99324
 
0.2%
c 61356
 
0.1%
Other values (4) 114036
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 48198536
74.8%
Lowercase Letter 16249352
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23969178
49.7%
4 7994586
 
16.6%
7 7956618
 
16.5%
1 7956618
 
16.5%
9 160724
 
0.3%
0 122756
 
0.3%
8 38012
 
0.1%
6 44
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 8018018
49.3%
b 7994674
49.2%
e 99324
 
0.6%
c 61356
 
0.4%
f 38012
 
0.2%
d 37968
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 48198536
74.8%
Latin 16249352
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23969178
49.7%
4 7994586
 
16.6%
7 7956618
 
16.5%
1 7956618
 
16.5%
9 160724
 
0.3%
0 122756
 
0.3%
8 38012
 
0.1%
6 44
 
< 0.1%
Latin
ValueCountFrequency (%)
a 8018018
49.3%
b 7994674
49.2%
e 99324
 
0.6%
c 61356
 
0.4%
f 38012
 
0.2%
d 37968
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64447888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23969178
37.2%
a 8018018
 
12.4%
b 7994674
 
12.4%
4 7994586
 
12.4%
7 7956618
 
12.3%
1 7956618
 
12.3%
9 160724
 
0.2%
0 122756
 
0.2%
e 99324
 
0.2%
c 61356
 
0.1%
Other values (4) 114036
 
0.2%

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 64447888
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 8fd95e4b
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7720433 | 95.8%
9a0c095e | 172448 | 2.1%
8fd95e4b | 162673 | 2.0%
06fb9ba8 | 386 | < 0.1%
3cbe86ba | 45 | < 0.1%
9276e4bb | 1 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23496420
36.5%
a 7893312
 
12.2%
b 7883970
 
12.2%
4 7883107
 
12.2%
7 7720434
 
12.0%
1 7720433
 
12.0%
9 507956
 
0.8%
0 345282
 
0.5%
e 335167
 
0.5%
c 172493
 
0.3%
Other values (6) 489314
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 47837214
74.2%
Lowercase Letter 16610674
 
25.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23496420
49.1%
4 7883107
 
16.5%
7 7720434
 
16.1%
1 7720433
 
16.1%
9 507956
 
1.1%
0 345282
 
0.7%
8 163104
 
0.3%
6 432
 
< 0.1%
3 45
 
< 0.1%
2 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 7893312
47.5%
b 7883970
47.5%
e 335167
 
2.0%
c 172493
 
1.0%
f 163059
 
1.0%
d 162673
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Common 47837214
74.2%
Latin 16610674
 
25.8%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23496420
49.1%
4 7883107
 
16.5%
7 7720434
 
16.1%
1 7720433
 
16.1%
9 507956
 
1.1%
0 345282
 
0.7%
8 163104
 
0.3%
6 432
 
< 0.1%
3 45
 
< 0.1%
2 1
 
< 0.1%
Latin
ValueCountFrequency (%)
a 7893312
47.5%
b 7883970
47.5%
e 335167
 
2.0%
c 172493
 
1.0%
f 163059
 
1.0%
d 162673
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64447888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23496420
36.5%
a 7893312
 
12.2%
b 7883970
 
12.2%
4 7883107
 
12.2%
7 7720434
 
12.0%
1 7720433
 
12.0%
9 507956
 
0.8%
0 345282
 
0.5%
e 335167
 
0.5%
c 172493
 
0.3%
Other values (6) 489314
 
0.8%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  ZEROS 

Distinct: 4833
Distinct (%): 4.9%
Missing: 7956618
Missing (%): 98.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 9499369.01
Minimum: 0
Maximum: 1750000000
Zeros: 90662
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 7351210
Maximum: 1750000000
Range: 1750000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 98302336.89
Coefficient of variation (CV): 10.34830174
Kurtosis: 156.858693
Mean: 9499369.01
Median Absolute Deviation (MAD): 0
Skewness: 12.42348167
Sum: 9.439332998 × 10^11
Variance: 9.663349438 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 90662 | 1.1%
1300000000 | 504 | < 0.1%
31000000 | 74 | < 0.1%
938159252.8 | 72 | < 0.1%
14536950 | 72 | < 0.1%
10175865 | 72 | < 0.1%
11108369.7 | 72 | < 0.1%
456217535.2 | 72 | < 0.1%
42669516.65 | 72 | < 0.1%
59320800 | 72 | < 0.1%
Other values (4823) | 7624 | 0.1%
(Missing) | 7956618 | 98.8%

Value | Count | Frequency (%)
0 | 90662 | 1.1%
1 | 23 | < 0.1%
1000 | 6 | < 0.1%
2500 | 1 | < 0.1%
4264.7 | 1 | < 0.1%

Value | Count | Frequency (%)
1750000000 | 2 | < 0.1%
1300000000 | 504 | < 0.1%
1268000000 | 1 | < 0.1%
1135000000 | 1 | < 0.1%
938159252.8 | 72 | < 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 12031
Distinct (%): 3.6%
Missing: 7720433
Missing (%): 95.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 10095376.44
Minimum: 0
Maximum: 1.576941824 × 10^10
Zeros: 293947
Zeros (%): 3.6%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2304245.6
Maximum: 1.576941824 × 10^10
Range: 1.576941824 × 10^10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 106342638.4
Coefficient of variation (CV): 10.53379623
Kurtosis: 5862.181633
Mean: 10095376.44
Median Absolute Deviation (MAD): 0
Skewness: 46.93143835
Sum: 3.387533852 × 10^12
Variance: 1.130875673 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 293947 | 3.6%
60000 | 907 | < 0.1%
740000000 | 702 | < 0.1%
130000 | 568 | < 0.1%
100000 | 545 | < 0.1%
975308630 | 464 | < 0.1%
50000 | 417 | < 0.1%
1300000000 | 396 | < 0.1%
65000 | 333 | < 0.1%
80000 | 319 | < 0.1%
Other values (12021) | 36955 | 0.5%
(Missing) | 7720433 | 95.8%

Value | Count | Frequency (%)
0 | 293947 | 3.6%
0.01 | 1 | < 0.1%
0.95 | 1 | < 0.1%
0.99 | 1 | < 0.1%
1 | 107 | < 0.1%

Value | Count | Frequency (%)
1.576941824 × 10^10 | 4 | < 0.1%
3250000000 | 8 | < 0.1%
3200000000 | 8 | < 0.1%
2000000000 | 44 | < 0.1%
1300000000 | 396 | < 0.1%

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 64447888
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3cbe86ba
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7720433 | 95.8%
c7a5ad39 | 251636 | 3.1%
3cbe86ba | 47608 | 0.6%
9276e4bb | 13668 | 0.2%
0e63c0f0 | 8809 | 0.1%
168ad9f3 | 4344 | 0.1%
5224034a | 2078 | < 0.1%
5994c34a | 1836 | < 0.1%
7b62420e | 1620 | < 0.1%
940efad7 | 1514 | < 0.1%
Other values (5) | 2440 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23417366
36.3%
a 8282191
 
12.9%
7 7990201
 
12.4%
b 7845443
 
12.2%
4 7745862
 
12.0%
1 7727542
 
12.0%
3 316311
 
0.5%
c 311530
 
0.5%
9 274834
 
0.4%
d 259096
 
0.4%
Other values (6) 277512
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 47658485
73.9%
Lowercase Letter 16789403
 
26.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23417366
49.1%
7 7990201
 
16.8%
4 7745862
 
16.3%
1 7727542
 
16.2%
3 316311
 
0.7%
9 274834
 
0.6%
6 76887
 
0.2%
8 52947
 
0.1%
0 32745
 
0.1%
2 23790
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 8282191
49.3%
b 7845443
46.7%
c 311530
 
1.9%
d 259096
 
1.5%
e 73736
 
0.4%
f 17407
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 47658485
73.9%
Latin 16789403
 
26.1%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23417366
49.1%
7 7990201
 
16.8%
4 7745862
 
16.3%
1 7727542
 
16.2%
3 316311
 
0.7%
9 274834
 
0.6%
6 76887
 
0.2%
8 52947
 
0.1%
0 32745
 
0.1%
2 23790
 
< 0.1%
Latin
ValueCountFrequency (%)
a 8282191
49.3%
b 7845443
46.7%
c 311530
 
1.9%
d 259096
 
1.5%
e 73736
 
0.4%
f 17407
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64447888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23417366
36.3%
a 8282191
 
12.9%
7 7990201
 
12.4%
b 7845443
 
12.2%
4 7745862
 
12.0%
1 7727542
 
12.0%
3 316311
 
0.5%
c 311530
 
0.5%
9 274834
 
0.4%
d 259096
 
0.4%
Other values (6) 277512
 
0.4%

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 64447888
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: c7a5ad39
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7956618 | 98.8%
c7a5ad39 | 83555 | 1.0%
0e63c0f0 | 4714 | 0.1%
9276e4bb | 4544 | 0.1%
168ad9f3 | 3835 | < 0.1%
3cbe86ba | 1242 | < 0.1%
7b62420e | 861 | < 0.1%
940efad7 | 196 | < 0.1%
2fd21cf1 | 148 | < 0.1%
f4d8a027 | 113 | < 0.1%
Other values (5) | 160 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23953565
37.2%
a 8129214
 
12.6%
7 8045955
 
12.5%
b 7969118
 
12.4%
4 7962522
 
12.4%
1 7960812
 
12.4%
3 93439
 
0.1%
9 92172
 
0.1%
c 89743
 
0.1%
d 87847
 
0.1%
Other values (6) 63501
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 48151191
74.7%
Lowercase Letter 16296697
 
25.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23953565
49.7%
7 8045955
 
16.7%
4 7962522
 
16.5%
1 7960812
 
16.5%
3 93439
 
0.2%
9 92172
 
0.2%
0 15391
 
< 0.1%
6 15263
 
< 0.1%
2 6819
 
< 0.1%
8 5253
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 8129214
49.9%
b 7969118
48.9%
c 89743
 
0.6%
d 87847
 
0.5%
e 11620
 
0.1%
f 9155
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 48151191
74.7%
Latin 16296697
 
25.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23953565
49.7%
7 8045955
 
16.7%
4 7962522
 
16.5%
1 7960812
 
16.5%
3 93439
 
0.2%
9 92172
 
0.2%
0 15391
 
< 0.1%
6 15263
 
< 0.1%
2 6819
 
< 0.1%
8 5253
 
< 0.1%
Latin
ValueCountFrequency (%)
a 8129214
49.9%
b 7969118
48.9%
c 89743
 
0.6%
d 87847
 
0.5%
e 11620
 
0.1%
f 9155
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64447888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23953565
37.2%
a 8129214
 
12.6%
7 8045955
 
12.5%
b 7969118
 
12.4%
4 7962522
 
12.4%
1 7960812
 
12.4%
3 93439
 
0.1%
9 92172
 
0.1%
c 89743
 
0.1%
d 87847
 
0.1%
Other values (6) 63501
 
0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 294
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.533827392
Minimum: 0
Maximum: 293
Zeros: 1462147
Zeros (%): 18.1%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 7
95-th percentile: 18
Maximum: 293
Range: 293
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 9.19827738
Coefficient of variation (CV): 1.662190872
Kurtosis: 221.1698905
Mean: 5.533827392
Median Absolute Deviation (MAD): 3
Skewness: 10.80047156
Sum: 44580436
Variance: 84.60830676
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1462147 | 18.1%
1 | 1129278 | 14.0%
2 | 893984 | 11.1%
3 | 728098 | 9.0%
4 | 604925 | 7.5%
5 | 512020 | 6.4%
6 | 433456 | 5.4%
7 | 361361 | 4.5%
8 | 303040 | 3.8%
9 | 255339 | 3.2%
Other values (284) | 1372338 | 17.0%

Value | Count | Frequency (%)
0 | 1462147 | 18.1%
1 | 1129278 | 14.0%
2 | 893984 | 11.1%
3 | 728098 | 9.0%
4 | 604925 | 7.5%

Value | Count | Frequency (%)
293 | 22 | < 0.1%
292 | 22 | < 0.1%
291 | 22 | < 0.1%
290 | 22 | < 0.1%
289 | 22 | < 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct: 101
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.56430969
Minimum: 0
Maximum: 100
Zeros: 325360
Zeros (%): 4.0%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
median: 12
Q3: 20
95-th percentile: 32
Maximum: 100
Range: 100
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 9.4929668
Coefficient of variation (CV): 0.6998488691
Kurtosis: 0.2571096344
Mean: 13.56430969
Median Absolute Deviation (MAD): 7
Skewness: 0.6049061605
Sum: 109273889
Variance: 90.11641867
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 325360 | 4.0%
1 | 325333 | 4.0%
2 | 325333 | 4.0%
3 | 325333 | 4.0%
4 | 325333 | 4.0%
5 | 325333 | 4.0%
6 | 325333 | 4.0%
7 | 325333 | 4.0%
8 | 325333 | 4.0%
9 | 325333 | 4.0%
Other values (91) | 4802629 | 59.6%

Value | Count | Frequency (%)
0 | 325360 | 4.0%
1 | 325333 | 4.0%
2 | 325333 | 4.0%
3 | 325333 | 4.0%
4 | 325333 | 4.0%

Value | Count | Frequency (%)
100 | 1 | < 0.1%
99 | 1 | < 0.1%
98 | 79 | < 0.1%
97 | 79 | < 0.1%
96 | 79 | < 0.1%

pmts_dpd_1073P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3060
Distinct (%): 0.3%
Missing: 6871133
Missing (%): 85.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.11641529
Minimum: 0
Maximum: 4520
Zeros: 1115719
Zeros (%): 13.8%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 4520
Range: 4520
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 146.1369082
Coefficient of variation (CV): 12.06106796
Kurtosis: 359.4909907
Mean: 12.11641529
Median Absolute Deviation (MAD): 0
Skewness: 17.57120307
Sum: 14356171
Variance: 21355.99592
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1115719 | 13.8%
1 | 11291 | 0.1%
2 | 4384 | 0.1%
3 | 3939 | < 0.1%
4 | 3536 | < 0.1%
7 | 2052 | < 0.1%
5 | 2032 | < 0.1%
6 | 1932 | < 0.1%
8 | 1684 | < 0.1%
9 | 1614 | < 0.1%
Other values (3050) | 36670 | 0.5%
(Missing) | 6871133 | 85.3%

Value | Count | Frequency (%)
0 | 1115719 | 13.8%
1 | 11291 | 0.1%
2 | 4384 | 0.1%
3 | 3939 | < 0.1%
4 | 3536 | < 0.1%

Value | Count | Frequency (%)
4520 | 1 | < 0.1%
4507 | 1 | < 0.1%
4475 | 1 | < 0.1%
4454 | 1 | < 0.1%
4422 | 1 | < 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 4042
Distinct (%): 0.1%
Missing: 4608761
Missing (%): 57.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.02591302
Minimum: -12
Maximum: 84574
Zeros: 2884014
Zeros (%): 35.8%
Negative: 531
Negative (%): < 0.1%
Memory size: 61.5 MiB

Quantile statistics

Minimum: -12
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 360
Maximum: 84574
Range: 84586
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 282.7228433
Coefficient of variation (CV): 4.957795997
Kurtosis: 9299.912895
Mean: 57.02591302
Median Absolute Deviation (MAD): 0
Skewness: 36.80773901
Sum: 196581153
Variance: 79932.20611
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2884014 | 35.8%
1 | 79122 | 1.0%
3 | 20657 | 0.3%
2 | 19157 | 0.2%
4 | 17056 | 0.2%
6 | 13852 | 0.2%
5 | 11317 | 0.1%
7 | 11133 | 0.1%
9 | 8081 | 0.1%
8 | 7862 | 0.1%
Other values (4032) | 374974 | 4.7%
(Missing) | 4608761 | 57.2%

Value | Count | Frequency (%)
-12 | 1 | < 0.1%
-9 | 1 | < 0.1%
-8 | 7 | < 0.1%
-7 | 13 | < 0.1%
-6 | 23 | < 0.1%

Value | Count | Frequency (%)
84574 | 1 | < 0.1%
84560 | 1 | < 0.1%
84533 | 1 | < 0.1%
84505 | 1 | < 0.1%
4668 | 1 | < 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 5585750
Missing (%): 69.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452053228
Coefficient of variation (CV): 0.531085112
Kurtosis: -1.216783251
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 16056534
Variance: 11.91667149
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
2 | 205853 | 2.6%
3 | 205853 | 2.6%
4 | 205853 | 2.6%
5 | 205853 | 2.6%
6 | 205853 | 2.6%
7 | 205853 | 2.6%
8 | 205853 | 2.6%
9 | 205853 | 2.6%
10 | 205853 | 2.6%
11 | 205853 | 2.6%
Other values (2) | 411706 | 5.1%
(Missing) | 5585750 | 69.3%

Value | Count | Frequency (%)
1 | 205853 | 2.6%
2 | 205853 | 2.6%
3 | 205853 | 2.6%
4 | 205853 | 2.6%
5 | 205853 | 2.6%

Value | Count | Frequency (%)
12 | 205853 | 2.6%
11 | 205853 | 2.6%
10 | 205853 | 2.6%
9 | 205853 | 2.6%
8 | 205853 | 2.6%
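
Because every month from 1 to 12 occurs exactly 205853 times, pmts_month_158T follows a discrete uniform distribution over its non-missing rows, and its moments have closed forms. The sketch below checks the reported mean, sum, variance and kurtosis against those formulas. It assumes the reported variance uses the n − 1 divisor (sample variance), which is not stated in the report but matches the figure to the printed precision.

```python
from fractions import Fraction

K = 12                     # distinct values 1..12
COUNT_PER_MONTH = 205_853  # count of each month in the frequency table
n = K * COUNT_PER_MONTH    # non-missing observations

# Closed-form moments of a discrete uniform distribution on 1..K.
mean = Fraction(K + 1, 2)
pop_variance = Fraction(K * K - 1, 12)
excess_kurtosis = Fraction(-6 * (K * K + 1), 5 * (K * K - 1))

# Assumption: the report applies the n - 1 (Bessel) correction.
sample_variance = pop_variance * n / (n - 1)

total = mean * n  # should equal the reported Sum

print(float(mean), int(total))
print(f"{float(sample_variance):.8f}")
print(f"{float(excess_kurtosis):.6f}")
```

The same reasoning applies to pmts_month_706T, where each month occurs 612783 times; only n changes, which is why its variance differs from this one only in the seventh significant digit.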

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 702590
Missing (%): 8.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052764
Coefficient of variation (CV): 0.5310850407
Kurtosis: -1.216783228
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 47797074
Variance: 11.91666829
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
2 | 612783 | 7.6%
3 | 612783 | 7.6%
4 | 612783 | 7.6%
5 | 612783 | 7.6%
6 | 612783 | 7.6%
7 | 612783 | 7.6%
8 | 612783 | 7.6%
9 | 612783 | 7.6%
10 | 612783 | 7.6%
11 | 612783 | 7.6%
Other values (2) | 1225566 | 15.2%
(Missing) | 702590 | 8.7%

Value | Count | Frequency (%)
1 | 612783 | 7.6%
2 | 612783 | 7.6%
3 | 612783 | 7.6%
4 | 612783 | 7.6%
5 | 612783 | 7.6%

Value | Count | Frequency (%)
12 | 612783 | 7.6%
11 | 612783 | 7.6%
10 | 612783 | 7.6%
9 | 612783 | 7.6%
8 | 612783 | 7.6%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 52480
Distinct (%): 4.4%
Missing: 6863527
Missing (%): 85.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 2961.235791
Minimum: 0
Maximum: 149930380
Zeros: 1122141
Zeros (%): 13.9%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 529.833842
Maximum: 149930380
Range: 149930380
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 388144.9616
Coefficient of variation (CV): 131.075331
Kurtosis: 131049.0785
Mean: 2961.235791
Median Absolute Deviation (MAD): 0
Skewness: 344.9531162
Sum: 3531152270
Variance: 1.506565112 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1122141 | 13.9%
10 | 107 | < 0.1%
14 | 107 | < 0.1%
400 | 89 | < 0.1%
3350.3801 | 72 | < 0.1%
0.4 | 68 | < 0.1%
1000 | 64 | < 0.1%
28000 | 64 | < 0.1%
20 | 56 | < 0.1%
1 | 52 | < 0.1%
Other values (52470) | 69639 | 0.9%
(Missing) | 6863527 | 85.2%

Value | Count | Frequency (%)
0 | 1122141 | 13.9%
0.002 | 2 | < 0.1%
0.004 | 5 | < 0.1%
0.006 | 4 | < 0.1%
0.008 | 5 | < 0.1%

Value | Count | Frequency (%)
149930380 | 7 | < 0.1%
23045082 | 32 | < 0.1%
15555000 | 18 | < 0.1%
8150148 | 5 | < 0.1%
5045893.5 | 3 | < 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 280697
Distinct (%): 8.1%
Missing: 4604283
Missing (%): 57.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 4299.084887
Minimum: 0
Maximum: 46413904
Zeros: 2864665
Zeros (%): 35.6%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 16413.764
Maximum: 46413904
Range: 46413904
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 89787.06774
Coefficient of variation (CV): 20.88515814
Kurtosis: 104914.0842
Mean: 4299.084887
Median Absolute Deviation (MAD): 0
Skewness: 259.3446397
Sum: 1.48391642 × 10^10
Variance: 8061717534
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2864665 | 35.6%
0.2 | 1719 | < 0.1%
1000 | 894 | < 0.1%
0.4 | 854 | < 0.1%
0.8 | 641 | < 0.1%
2000 | 595 | < 0.1%
3000 | 534 | < 0.1%
2 | 519 | < 0.1%
1.6 | 511 | < 0.1%
0.6 | 479 | < 0.1%
Other values (280687) | 580292 | 7.2%
(Missing) | 4604283 | 57.2%

Value | Count | Frequency (%)
0 | 2864665 | 35.6%
0.002 | 39 | < 0.1%
0.004 | 25 | < 0.1%
0.006 | 17 | < 0.1%
0.008 | 22 | < 0.1%

Value | Count | Frequency (%)
46413904 | 4 | < 0.1%
27888398 | 2 | < 0.1%
26678742 | 2 | < 0.1%
24440374 | 4 | < 0.1%
20743684 | 1 | < 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): < 0.1%
Missing: 5585750
Missing (%): 69.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.345637
Minimum: 2016
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 2016
5-th percentile: 2018
Q1: 2019
median: 2019
Q3: 2020
95-th percentile: 2020
Maximum: 2021
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7899714889
Coefficient of variation (CV): 0.0003912017212
Kurtosis: -0.6975550172
Mean: 2019.345637
Median Absolute Deviation (MAD): 1
Skewness: -0.2526182713
Sum: 4988260289
Variance: 0.6240549533
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
2020 | 1073524 | 13.3%
2019 | 906216 | 11.2%
2018 | 399723 | 5.0%
2021 | 90404 | 1.1%
2017 | 303 | < 0.1%
2016 | 66 | < 0.1%
(Missing) | 5585750 | 69.3%

Value | Count | Frequency (%)
2016 | 66 | < 0.1%
2017 | 303 | < 0.1%
2018 | 399723 | 5.0%
2019 | 906216 | 11.2%
2020 | 1073524 | 13.3%

Value | Count | Frequency (%)
2021 | 90404 | 1.1%
2020 | 1073524 | 13.3%
2019 | 906216 | 11.2%
2018 | 399723 | 5.0%
2017 | 303 | < 0.1%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct: 22
Distinct (%): < 0.1%
Missing: 702590
Missing (%): 8.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.818264
Minimum: 2000
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 61.5 MiB

Quantile statistics

Minimum: 2000
5-th percentile: 2007
Q1: 2012
median: 2016
Q3: 2018
95-th percentile: 2019
Maximum: 2021
Range: 21
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.971782424
Coefficient of variation (CV): 0.001971285696
Kurtosis: -0.4506686967
Mean: 2014.818264
Median Absolute Deviation (MAD): 3
Skewness: -0.763033686
Sum: 1.481575656 × 10^10
Variance: 15.77505562
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
Value | Count | Frequency (%)
2018 | 1119912 | 13.9%
2019 | 1017290 | 12.6%
2017 | 864903 | 10.7%
2016 | 620562 | 7.7%
2015 | 547132 | 6.8%
2014 | 505515 | 6.3%
2013 | 436475 | 5.4%
2012 | 353630 | 4.4%
2011 | 304694 | 3.8%
2020 | 293980 | 3.6%
Other values (12) | 1289303 | 16.0%
(Missing) | 702590 | 8.7%

Value | Count | Frequency (%)
2000 | 11 | < 0.1%
2001 | 34 | < 0.1%
2002 | 135 | < 0.1%
2003 | 683 | < 0.1%
2004 | 13954 | 0.2%

Value | Count | Frequency (%)
2021 | 19104 | 0.2%
2020 | 293980 | 3.6%
2019 | 1017290 | 12.6%
2018 | 1119912 | 13.9%
2017 | 864903 | 10.7%

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.5 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000000248
Min length: 8

Characters and Unicode

Total characters: 64447890
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
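
The fractional mean length follows from a handful of longer values. The frequency table for this variable lists a single 9-character value, p28_48_88, occurring twice, while every other value has 8 characters. A short sketch reproducing the reported total characters and mean length from those counts (the constant names are illustrative):

```python
# Counts taken from the report: 8055986 rows, all of length 8
# except two rows holding the 9-character value p28_48_88.
N_OBSERVATIONS = 8_055_986
BASE_LENGTH = 8
LONG_VALUE_LENGTH = 9
LONG_VALUE_COUNT = 2

total_characters = (
    (N_OBSERVATIONS - LONG_VALUE_COUNT) * BASE_LENGTH
    + LONG_VALUE_COUNT * LONG_VALUE_LENGTH
)
mean_length = total_characters / N_OBSERVATIONS

print(total_characters)
print(f"{mean_length:.9f}")
```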

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7736685 | 96.0%
ab3c25cf | 312948 | 3.9%
15f04f45 | 3164 | < 0.1%
be4fd70b | 2204 | < 0.1%
daf49a8a | 965 | < 0.1%
71ddaa88 | 10 | < 0.1%
0c42a10e | 6 | < 0.1%
p28_48_88 | 2 | < 0.1%
9ba4314a | 1 | < 0.1%
1d94eac1 | 1 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23529331
36.5%
b 8054042
 
12.5%
a 8052557
 
12.5%
4 7746193
 
12.0%
1 7739868
 
12.0%
7 7738899
 
12.0%
c 625903
 
1.0%
f 322445
 
0.5%
2 312956
 
0.5%
3 312949
 
0.5%
Other values (7) 12747
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 47387536
73.5%
Lowercase Letter 17060348
 
26.5%
Connector Punctuation 4
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23529331
49.7%
4 7746193
 
16.3%
1 7739868
 
16.3%
7 7738899
 
16.3%
2 312956
 
0.7%
3 312949
 
0.7%
0 5380
 
< 0.1%
8 993
 
< 0.1%
9 967
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 8054042
47.2%
a 8052557
47.2%
c 625903
 
3.7%
f 322445
 
1.9%
d 3190
 
< 0.1%
e 2211
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 47387540
73.5%
Latin 17060350
 
26.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23529331
49.7%
4 7746193
 
16.3%
1 7739868
 
16.3%
7 7738899
 
16.3%
2 312956
 
0.7%
3 312949
 
0.7%
0 5380
 
< 0.1%
8 993
 
< 0.1%
9 967
 
< 0.1%
_ 4
 
< 0.1%
Latin
ValueCountFrequency (%)
b 8054042
47.2%
a 8052557
47.2%
c 625903
 
3.7%
f 322445
 
1.9%
d 3190
 
< 0.1%
e 2211
 
< 0.1%
P 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64447890
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23529331
36.5%
b 8054042
 
12.5%
a 8052557
 
12.5%
4 7746193
 
12.0%
1 7739868
 
12.0%
7 7738899
 
12.0%
c 625903
 
1.0%
f 322445
 
0.5%
2 312956
 
0.5%
3 312949
 
0.5%
Other values (7) 12747
 
< 0.1%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.5 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000013654
Min length: 8

Characters and Unicode

Total characters: 64447998
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7964485 | 98.9%
ab3c25cf | 88586 | 1.1%
15f04f45 | 1250 | < 0.1%
be4fd70b | 1011 | < 0.1%
daf49a8a | 543 | < 0.1%
p28_48_88 | 110 | < 0.1%
71ddaa88 | 1 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23984541
37.2%
b 8055093
 
12.5%
a 8054702
 
12.5%
4 7968649
 
12.4%
1 7965736
 
12.4%
7 7965497
 
12.4%
c 177172
 
0.3%
f 92640
 
0.1%
2 88696
 
0.1%
3 88586
 
0.1%
Other values (7) 6686
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 48065494
74.6%
Lowercase Letter 16382174
 
25.4%
Connector Punctuation 220
 
< 0.1%
Uppercase Letter 110
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23984541
49.9%
4 7968649
 
16.6%
1 7965736
 
16.6%
7 7965497
 
16.6%
2 88696
 
0.2%
3 88586
 
0.2%
0 2261
 
< 0.1%
8 985
 
< 0.1%
9 543
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 8055093
49.2%
a 8054702
49.2%
c 177172
 
1.1%
f 92640
 
0.6%
d 1556
 
< 0.1%
e 1011
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 220
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 110
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 48065714
74.6%
Latin 16382284
 
25.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23984541
49.9%
4 7968649
 
16.6%
1 7965736
 
16.6%
7 7965497
 
16.6%
2 88696
 
0.2%
3 88586
 
0.2%
0 2261
 
< 0.1%
8 985
 
< 0.1%
9 543
 
< 0.1%
_ 220
 
< 0.1%
Latin
ValueCountFrequency (%)
b 8055093
49.2%
a 8054702
49.2%
c 177172
 
1.1%
f 92640
 
0.6%
d 1556
 
< 0.1%
e 1011
 
< 0.1%
P 110
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64447998
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23984541
37.2%
b 8055093
 
12.5%
a 8054702
 
12.5%
4 7968649
 
12.4%
1 7965736
 
12.4%
7 7965497
 
12.4%
c 177172
 
0.3%
f 92640
 
0.1%
2 88696
 
0.1%
3 88586
 
0.1%
Other values (7) 6686
 
< 0.1%