Overview

Dataset statistics

Number of variables: 19
Number of observations: 27025737
Missing cells: 171502180
Missing cells (%): 33.4%
Total size in memory: 3.8 GiB
Average record size in memory: 152.0 B
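The overview figures can be cross-checked with a little arithmetic. The sketch below assumes that every column is held as an 8-byte value or pointer; that width is an assumption, chosen because it reproduces the 152.0 B record size and the 206.2 MiB per-column memory size reported further down.

```python
# Figures copied from the "Dataset statistics" block.
n_vars = 19
n_obs = 27_025_737
missing_cells = 171_502_180

# Share of missing cells over the full variables x observations grid.
total_cells = n_vars * n_obs
missing_pct = 100 * missing_cells / total_cells

# Assumption: each column stores one 8-byte value (or pointer) per row.
bytes_per_value = 8
record_bytes = n_vars * bytes_per_value
column_mib = n_obs * bytes_per_value / 2**20
total_gib = n_obs * record_bytes / 2**30

print(f"missing: {missing_pct:.1f}%, record: {record_bytes} B, "
      f"column: {column_mib:.1f} MiB, total: {total_gib:.1f} GiB")
```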

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 26668832 (98.7%) missing values [Missing]
collater_valueofguarantee_876L has 25972861 (96.1%) missing values [Missing]
pmts_dpd_1073P has 22404198 (82.9%) missing values [Missing]
pmts_dpd_303P has 15460265 (57.2%) missing values [Missing]
pmts_month_158T has 19045197 (70.5%) missing values [Missing]
pmts_month_706T has 2535801 (9.4%) missing values [Missing]
pmts_overdue_1140A has 22386387 (82.8%) missing values [Missing]
pmts_overdue_1152A has 15447641 (57.2%) missing values [Missing]
pmts_year_1139T has 19045197 (70.5%) missing values [Missing]
pmts_year_507T has 2535801 (9.4%) missing values [Missing]
collater_valueofguarantee_1124L is highly skewed (γ1 = 109.0660607) [Skewed]
collater_valueofguarantee_876L is highly skewed (γ1 = 96.87724828) [Skewed]
pmts_dpd_303P is highly skewed (γ1 = 652.8624643) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 162.4262987) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 517.8094681) [Skewed]
collater_valueofguarantee_1124L has 331552 (1.2%) zeros [Zeros]
collater_valueofguarantee_876L has 940755 (3.5%) zeros [Zeros]
num_group1 has 5874002 (21.7%) zeros [Zeros]
num_group2 has 1097116 (4.1%) zeros [Zeros]
pmts_dpd_1073P has 4355343 (16.1%) zeros [Zeros]
pmts_dpd_303P has 9676891 (35.8%) zeros [Zeros]
pmts_overdue_1140A has 4368450 (16.2%) zeros [Zeros]
pmts_overdue_1152A has 9605082 (35.5%) zeros [Zeros]
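The alert list can be read as the output of a few simple per-variable rules. The sketch below is a hypothetical reconstruction, not the profiler's actual code: the function name `classify_alerts` and the skewness cut-off of 20 are assumptions, picked only because they agree with this report (a skewness of 16.68 is not flagged, while 96.88 is).

```python
def classify_alerts(n_missing, n_zeros, skewness, skew_threshold=20.0):
    """Return the alert labels a variable would receive under the assumed rules."""
    alerts = []
    if n_missing > 0:                      # any missing value raises "Missing"
        alerts.append("Missing")
    if abs(skewness) > skew_threshold:     # assumed cut-off, see lead-in
        alerts.append("Skewed")
    if n_zeros > 0:                        # any zero raises "Zeros"
        alerts.append("Zeros")
    return alerts


# Inputs copied from the variable sections of this report.
dpd_1073 = classify_alerts(22_404_198, 4_355_343, 16.68261473)
dpd_303 = classify_alerts(15_460_265, 9_676_891, 652.8624643)
group1 = classify_alerts(0, 5_874_002, 5.368202173)
case_id = classify_alerts(0, 0, -0.1660401373)
print(dpd_1073, dpd_303, group1, case_id)
```

Under these assumed rules the labels match the badges shown for the four variables used as inputs.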

Reproduction

Analysis started: 2024-02-13 19:47:41.481606
Analysis finished: 2024-02-13 19:48:21.705748
Duration: 40.22 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
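The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-02-13 19:47:41.481606")
finished = datetime.fromisoformat("2024-02-13 19:48:21.705748")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```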

Variables

case_id
Real number (ℝ)

Distinct: 190313
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1400985.619
Minimum: 29427
Maximum: 2640040
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 29427
5-th percentile: 162139
Q1: 841397
median: 1564752
Q3: 1606419
95-th percentile: 2632222
Maximum: 2640040
Range: 2610613
Interquartile range (IQR): 765022
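Range and IQR are derived from the other quantiles, so they can be recomputed as a sanity check. The sketch also applies the common 1.5 × IQR fence rule, which is an addition for illustration and not something the report itself computes.

```python
# Quantiles of case_id copied from the block above.
minimum, q1, median, q3, maximum = 29427, 841397, 1564752, 1606419, 2640040

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50%

# Tukey-style fences (illustrative rule, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(value_range, iqr, lower_fence, upper_fence, has_outliers)
```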

Descriptive statistics

Standard deviation: 703984.7381
Coefficient of variation (CV): 0.5024924801
Kurtosis: -0.2395500727
Mean: 1400985.619
Median Absolute Deviation (MAD): 48978
Skewness: -0.1660401373
Sum: 3.786266888 × 10^13
Variance: 4.955945115 × 10^11
Monotonicity: Increasing
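Several of the descriptive statistics for case_id are functions of one another: CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of non-missing rows. A small check, using the rounded values printed in the report (so agreement is only to about six significant digits):

```python
import math

n = 27_025_737            # observations; case_id has no missing values
mean = 1400985.619
std = 703984.7381

cv = std / mean           # coefficient of variation
variance = std ** 2       # variance is the squared standard deviation
total = mean * n          # sum = mean x count

print(f"CV={cv:.10f} variance={variance:.9e} sum={total:.9e}")
```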
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1621606 3780
 
< 0.1%
1582326 2529
 
< 0.1%
1589462 2484
 
< 0.1%
1557807 2316
 
< 0.1%
169356 2304
 
< 0.1%
1546440 2148
 
< 0.1%
1569046 2016
 
< 0.1%
1559959 1980
 
< 0.1%
164897 1932
 
< 0.1%
1603701 1800
 
< 0.1%
Other values (190303) 27002448
99.9%
ValueCountFrequency (%)
29427 12
 
< 0.1%
29465 120
< 0.1%
29486 36
 
< 0.1%
29500 72
< 0.1%
29508 24
 
< 0.1%
ValueCountFrequency (%)
2640040 72
< 0.1%
2640038 168
< 0.1%
2640036 12
 
< 0.1%
2640035 12
 
< 0.1%
2640034 12
 
< 0.1%
Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 206.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 216205896
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
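As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The helper name `category_counts` is hypothetical; "Nd" corresponds to Decimal Number and "Ll" to Lowercase Letter in the tables of this report.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the Unicode general categories of the characters in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# The most frequent value of this variable: six digits and two letters.
counts = category_counts("a55475b1")
print(counts)
```

Six of eight characters being digits is in line with the roughly 75% share of Decimal Number characters reported for this variable.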

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9a0c095e
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 26668832
98.7%
9a0c095e 248834
 
0.9%
8fd95e4b 107850
 
0.4%
06fb9ba8 221
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 80363180
37.2%
a 26917887
 
12.5%
b 26777124
 
12.4%
4 26776682
 
12.4%
7 26668832
 
12.3%
1 26668832
 
12.3%
9 605739
 
0.3%
0 497889
 
0.2%
e 356684
 
0.2%
c 248834
 
0.1%
Other values (4) 324213
 
0.1%
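The character frequencies above are fully determined by the four distinct values and their counts. A short sketch that rebuilds them by weighting each character by the number of rows carrying its value:

```python
from collections import Counter

# Value counts copied from the frequency table of this variable.
value_counts = {
    "a55475b1": 26_668_832,
    "9a0c095e": 248_834,
    "8fd95e4b": 107_850,
    "06fb9ba8": 221,
}

# Each character contributes once per row in which its value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
print(char_counts.most_common(3), total_chars)
```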

Most occurring categories

ValueCountFrequency (%)
Decimal Number 161689446
74.8%
Lowercase Letter 54516450
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 80363180
49.7%
4 26776682
 
16.6%
7 26668832
 
16.5%
1 26668832
 
16.5%
9 605739
 
0.4%
0 497889
 
0.3%
8 108071
 
0.1%
6 221
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 26917887
49.4%
b 26777124
49.1%
e 356684
 
0.7%
c 248834
 
0.5%
f 108071
 
0.2%
d 107850
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 161689446
74.8%
Latin 54516450
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 80363180
49.7%
4 26776682
 
16.6%
7 26668832
 
16.5%
1 26668832
 
16.5%
9 605739
 
0.4%
0 497889
 
0.3%
8 108071
 
0.1%
6 221
 
< 0.1%
Latin
ValueCountFrequency (%)
a 26917887
49.4%
b 26777124
49.1%
e 356684
 
0.7%
c 248834
 
0.5%
f 108071
 
0.2%
d 107850
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 216205896
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 80363180
37.2%
a 26917887
 
12.5%
b 26777124
 
12.4%
4 26776682
 
12.4%
7 26668832
 
12.3%
1 26668832
 
12.3%
9 605739
 
0.3%
0 497889
 
0.2%
e 356684
 
0.2%
c 248834
 
0.1%
Other values (4) 324213
 
0.1%
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 206.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 216205896
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 25972861
96.1%
9a0c095e 585801
 
2.2%
8fd95e4b 465155
 
1.7%
06fb9ba8 1667
 
< 0.1%
3cbe86ba 244
 
< 0.1%
c7a5ad39 6
 
< 0.1%
9276e4bb 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 78969545
36.5%
a 26560585
 
12.3%
b 26441844
 
12.2%
4 26438019
 
12.2%
7 25972870
 
12.0%
1 25972861
 
12.0%
9 1638433
 
0.8%
0 1173269
 
0.5%
e 1051203
 
0.5%
c 586051
 
0.3%
Other values (6) 1401216
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 160634230
74.3%
Lowercase Letter 55571666
 
25.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 78969545
49.2%
4 26438019
 
16.5%
7 25972870
 
16.2%
1 25972861
 
16.2%
9 1638433
 
1.0%
0 1173269
 
0.7%
8 467066
 
0.3%
6 1914
 
< 0.1%
3 250
 
< 0.1%
2 3
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 26560585
47.8%
b 26441844
47.6%
e 1051203
 
1.9%
c 586051
 
1.1%
f 466822
 
0.8%
d 465161
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common 160634230
74.3%
Latin 55571666
 
25.7%

Most frequent character per script

Common
ValueCountFrequency (%)
5 78969545
49.2%
4 26438019
 
16.5%
7 25972870
 
16.2%
1 25972861
 
16.2%
9 1638433
 
1.0%
0 1173269
 
0.7%
8 467066
 
0.3%
6 1914
 
< 0.1%
3 250
 
< 0.1%
2 3
 
< 0.1%
Latin
ValueCountFrequency (%)
a 26560585
47.8%
b 26441844
47.6%
e 1051203
 
1.9%
c 586051
 
1.1%
f 466822
 
0.8%
d 465161
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 216205896
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 78969545
36.5%
a 26560585
 
12.3%
b 26441844
 
12.2%
4 26438019
 
12.2%
7 25972870
 
12.0%
1 25972861
 
12.0%
9 1638433
 
0.8%
0 1173269
 
0.5%
e 1051203
 
0.5%
c 586051
 
0.3%
Other values (6) 1401216
 
0.6%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 16238
Distinct (%): 4.5%
Missing: 26668832
Missing (%): 98.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1742642.654
Minimum: 0
Maximum: 9240000000
Zeros: 331552
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3650000
Maximum: 9240000000
Range: 9240000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 56378655.16
Coefficient of variation (CV): 32.35239022
Kurtosis: 14169.42106
Mean: 1742642.654
Median Absolute Deviation (MAD): 0
Skewness: 109.0660607
Sum: 6.219578764 × 10^11
Variance: 3.178552757 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 331552
 
1.2%
300000000 250
 
< 0.1%
5000000 169
 
< 0.1%
3000000 168
 
< 0.1%
4000000 164
 
< 0.1%
1 113
 
< 0.1%
2000000 113
 
< 0.1%
6000000 109
 
< 0.1%
10000000 98
 
< 0.1%
1000000 94
 
< 0.1%
Other values (16228) 24075
 
0.1%
(Missing) 26668832
98.7%
ValueCountFrequency (%)
0 331552
1.2%
1 113
 
< 0.1%
2 4
 
< 0.1%
100 1
 
< 0.1%
383 1
 
< 0.1%
ValueCountFrequency (%)
9240000000 3
< 0.1%
7723000000 5
< 0.1%
6178400000 5
< 0.1%
5366660988 1
 
< 0.1%
4768000000 2
 
< 0.1%
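For collater_valueofguarantee_1124L the percentages quoted against all rows hide how the non-missing values behave. The sketch below recomputes a few shares relative to the non-missing rows only; it uses the observation count from the overview and the figures of this section, and it suggests that "Distinct (%)" is taken over non-missing rows.

```python
import math

n_obs = 27_025_737
n_missing = 26_668_832
n_zeros = 331_552
n_distinct = 16_238
total_sum = 6.219578764e11

present = n_obs - n_missing                  # rows that actually carry a value
zero_share = 100 * n_zeros / present         # zeros among non-missing rows
distinct_pct = 100 * n_distinct / present    # distinct values among non-missing rows
mean = total_sum / present                   # mean over non-missing rows

print(present, round(zero_share, 1), round(distinct_pct, 1), round(mean, 3))
```

With roughly 93% of the non-missing values equal to zero, the median and both quartiles are 0 while the 95-th percentile is not, and the mean is driven by a small number of very large guarantees.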

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 36507
Distinct (%): 3.5%
Missing: 25972861
Missing (%): 96.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2452025.834
Minimum: 0
Maximum: 1.4 × 10^10
Zeros: 940755
Zeros (%): 3.5%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 150000
Maximum: 1.4 × 10^10
Range: 1.4 × 10^10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 91215442.91
Coefficient of variation (CV): 37.2000334
Kurtosis: 13037.81762
Mean: 2452025.834
Median Absolute Deviation (MAD): 0
Skewness: 96.87724828
Sum: 2.581679151 × 10^12
Variance: 8.320257025 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 940755
 
3.5%
60000 3185
 
< 0.1%
130000 2455
 
< 0.1%
100000 2165
 
< 0.1%
50000 1564
 
< 0.1%
65000 1465
 
< 0.1%
70000 1084
 
< 0.1%
80000 1063
 
< 0.1%
150000 950
 
< 0.1%
200000 946
 
< 0.1%
Other values (36497) 97244
 
0.4%
(Missing) 25972861
96.1%
ValueCountFrequency (%)
0 940755
3.5%
0.01 44
 
< 0.1%
0.02 41
 
< 0.1%
0.03 45
 
< 0.1%
0.04 60
 
< 0.1%
ValueCountFrequency (%)
1.4 × 10^10 22

< 0.1%
1 × 10^10 8
 
< 0.1%
4000000000 26
 
< 0.1%
3250000000 87
< 0.1%
3200000000 13
 
< 0.1%
Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 206.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 216205896
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 25972861
96.1%
c7a5ad39 800470
 
3.0%
3cbe86ba 178913
 
0.7%
9276e4bb 25260
 
0.1%
0e63c0f0 14276
 
0.1%
168ad9f3 7722
 
< 0.1%
7b62420e 6345
 
< 0.1%
5224034a 5348
 
< 0.1%
940efad7 5188
 
< 0.1%
2fd21cf1 2991
 
< 0.1%
Other values (5) 6363
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 78728023
36.4%
a 27777183
 
12.8%
7 26814126
 
12.4%
b 26389856
 
12.2%
4 26027945
 
12.0%
1 25987760
 
12.0%
3 1009156
 
0.5%
c 1000272
 
0.5%
9 843494
 
0.4%
d 818003
 
0.4%
Other values (6) 810078
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 159954539
74.0%
Lowercase Letter 56251357
 
26.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 78728023
49.2%
7 26814126
 
16.8%
4 26027945
 
16.3%
1 25987760
 
16.2%
3 1009156
 
0.6%
9 843494
 
0.5%
6 234820
 
0.1%
8 189462
 
0.1%
0 63493
 
< 0.1%
2 56260
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 27777183
49.4%
b 26389856
46.9%
c 1000272
 
1.8%
d 818003
 
1.5%
e 231177
 
0.4%
f 34866
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 159954539
74.0%
Latin 56251357
 
26.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 78728023
49.2%
7 26814126
 
16.8%
4 26027945
 
16.3%
1 25987760
 
16.2%
3 1009156
 
0.6%
9 843494
 
0.5%
6 234820
 
0.1%
8 189462
 
0.1%
0 63493
 
< 0.1%
2 56260
 
< 0.1%
Latin
ValueCountFrequency (%)
a 27777183
49.4%
b 26389856
46.9%
c 1000272
 
1.8%
d 818003
 
1.5%
e 231177
 
0.4%
f 34866
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 216205896
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 78728023
36.4%
a 27777183
 
12.8%
7 26814126
 
12.4%
b 26389856
 
12.2%
4 26027945
 
12.0%
1 25987760
 
12.0%
3 1009156
 
0.5%
c 1000272
 
0.5%
9 843494
 
0.4%
d 818003
 
0.4%
Other values (6) 810078
 
0.4%
Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 206.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 216205896
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c7a5ad39
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 26668832
98.7%
c7a5ad39 323928
 
1.2%
9276e4bb 13573
 
0.1%
0e63c0f0 9205
 
< 0.1%
7b62420e 3849
 
< 0.1%
168ad9f3 3443
 
< 0.1%
940efad7 825
 
< 0.1%
f4d8a027 731
 
< 0.1%
2fd21cf1 436
 
< 0.1%
3cbe86ba 334
 
< 0.1%
Other values (5) 581
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 80330813
37.2%
a 27322715
 
12.6%
7 27012009
 
12.5%
b 26700762
 
12.3%
4 26688630
 
12.3%
1 26673222
 
12.3%
9 341939
 
0.2%
3 337224
 
0.2%
c 334063
 
0.2%
d 329363
 
0.2%
Other values (6) 135156
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 161476052
74.7%
Lowercase Letter 54729844
 
25.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 80330813
49.7%
7 27012009
 
16.7%
4 26688630
 
16.5%
1 26673222
 
16.5%
9 341939
 
0.2%
3 337224
 
0.2%
0 33629
 
< 0.1%
6 30671
 
< 0.1%
2 23332
 
< 0.1%
8 4583
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 27322715
49.9%
b 26700762
48.8%
c 334063
 
0.6%
d 329363
 
0.6%
e 27861
 
0.1%
f 15080
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 161476052
74.7%
Latin 54729844
 
25.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 80330813
49.7%
7 27012009
 
16.7%
4 26688630
 
16.5%
1 26673222
 
16.5%
9 341939
 
0.2%
3 337224
 
0.2%
0 33629
 
< 0.1%
6 30671
 
< 0.1%
2 23332
 
< 0.1%
8 4583
 
< 0.1%
Latin
ValueCountFrequency (%)
a 27322715
49.9%
b 26700762
48.8%
c 334063
 
0.6%
d 329363
 
0.6%
e 27861
 
0.1%
f 15080
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 216205896
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 80330813
37.2%
a 27322715
 
12.6%
7 27012009
 
12.5%
b 26700762
 
12.3%
4 26688630
 
12.3%
1 26673222
 
12.3%
9 341939
 
0.2%
3 337224
 
0.2%
c 334063
 
0.2%
d 329363
 
0.2%
Other values (6) 135156
 
0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 187
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.462336069
Minimum: 0
Maximum: 186
Zeros: 5874002
Zeros (%): 21.7%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 6
95-th percentile: 15
Maximum: 186
Range: 186
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 6.058499172
Coefficient of variation (CV): 1.357696749
Kurtosis: 71.96864449
Mean: 4.462336069
Median Absolute Deviation (MAD): 3
Skewness: 5.368202173
Sum: 120597921
Variance: 36.70541222
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 5874002
21.7%
1 4100102
15.2%
2 3127750
11.6%
3 2493467
9.2%
4 2028536
 
7.5%
5 1669231
 
6.2%
6 1375530
 
5.1%
7 1133101
 
4.2%
8 920849
 
3.4%
9 760350
 
2.8%
Other values (177) 3542819
13.1%
ValueCountFrequency (%)
0 5874002
21.7%
1 4100102
15.2%
2 3127750
11.6%
3 2493467
9.2%
4 2028536
 
7.5%
ValueCountFrequency (%)
186 12
< 0.1%
185 12
< 0.1%
184 12
< 0.1%
183 12
< 0.1%
182 12
< 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct: 79
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.50928202
Minimum: 0
Maximum: 78
Zeros: 1097116
Zeros (%): 4.1%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
median: 12
Q3: 20
95-th percentile: 32
Maximum: 78
Range: 78
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 9.401897997
Coefficient of variation (CV): 0.6959583774
Kurtosis: -0.7013637751
Mean: 13.50928202
Median Absolute Deviation (MAD): 7
Skewness: 0.4874184528
Sum: 365098303
Variance: 88.39568594
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1097116
 
4.1%
1 1096992
 
4.1%
2 1096990
 
4.1%
7 1096989
 
4.1%
11 1096989
 
4.1%
10 1096989
 
4.1%
8 1096989
 
4.1%
9 1096989
 
4.1%
6 1096989
 
4.1%
5 1096989
 
4.1%
Other values (69) 16055716
59.4%
ValueCountFrequency (%)
0 1097116
4.1%
1 1096992
4.1%
2 1096990
4.1%
3 1096989
4.1%
4 1096989
4.1%
ValueCountFrequency (%)
78 5
< 0.1%
77 5
< 0.1%
76 5
< 0.1%
75 5
< 0.1%
74 5
< 0.1%

pmts_dpd_1073P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3904
Distinct (%): 0.1%
Missing: 22404198
Missing (%): 82.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.1456015
Minimum: 0
Maximum: 4871
Zeros: 4355343
Zeros (%): 16.1%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 4871
Range: 4871
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 139.1766797
Coefficient of variation (CV): 11.45901912
Kurtosis: 335.8609959
Mean: 12.1456015
Median Absolute Deviation (MAD): 0
Skewness: 16.68261473
Sum: 56131371
Variance: 19370.14819
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 4355343
 
16.1%
1 44061
 
0.2%
2 15838
 
0.1%
3 15811
 
0.1%
4 13633
 
0.1%
7 8922
 
< 0.1%
5 8285
 
< 0.1%
6 7593
 
< 0.1%
10 6254
 
< 0.1%
9 6199
 
< 0.1%
Other values (3894) 139600
 
0.5%
(Missing) 22404198
82.9%
ValueCountFrequency (%)
0 4355343
16.1%
1 44061
 
0.2%
2 15838
 
0.1%
3 15811
 
0.1%
4 13633
 
0.1%
ValueCountFrequency (%)
4871 1
< 0.1%
4860 1
< 0.1%
4823 1
< 0.1%
4808 1
< 0.1%
4796 1
< 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 4246
Distinct (%): < 0.1%
Missing: 15460265
Missing (%): 57.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.75764266
Minimum: -16
Maximum: 657458
Zeros: 9676891
Zeros (%): 35.8%
Negative: 2117
Negative (%): < 0.1%
Memory size: 206.2 MiB

Quantile statistics

Minimum: -16
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 345
Maximum: 657458
Range: 657474
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 338.7117754
Coefficient of variation (CV): 6.074714769
Kurtosis: 1231360.448
Mean: 55.75764266
Median Absolute Deviation (MAD): 0
Skewness: 652.8624643
Sum: 644863455
Variance: 114725.6668
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 9676891
35.8%
1 274525
 
1.0%
3 70820
 
0.3%
2 67426
 
0.2%
4 58691
 
0.2%
6 48626
 
0.2%
5 38898
 
0.1%
7 38406
 
0.1%
9 27359
 
0.1%
8 26573
 
0.1%
Other values (4236) 1237257
 
4.6%
(Missing) 15460265
57.2%
ValueCountFrequency (%)
-16 5
 
< 0.1%
-15 21
< 0.1%
-14 2
 
< 0.1%
-12 32
< 0.1%
-11 44
< 0.1%
ValueCountFrequency (%)
657458 1
 
< 0.1%
84575 1
 
< 0.1%
84574 2
< 0.1%
84561 1
 
< 0.1%
84560 3
< 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 19045197
Missing (%): 70.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052746
Coefficient of variation (CV): 0.5310850378
Kurtosis: -1.216783227
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 51873510
Variance: 11.91666816
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 665045
 
2.5%
3 665045
 
2.5%
4 665045
 
2.5%
5 665045
 
2.5%
6 665045
 
2.5%
7 665045
 
2.5%
8 665045
 
2.5%
9 665045
 
2.5%
10 665045
 
2.5%
11 665045
 
2.5%
Other values (2) 1330090
 
4.9%
(Missing) 19045197
70.5%
ValueCountFrequency (%)
1 665045
2.5%
2 665045
2.5%
3 665045
2.5%
4 665045
2.5%
5 665045
2.5%
ValueCountFrequency (%)
12 665045
2.5%
11 665045
2.5%
10 665045
2.5%
9 665045
2.5%
8 665045
2.5%
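Because every month from 1 to 12 occurs exactly 665045 times, pmts_month_158T is a discrete uniform variable and its moments can be reproduced from the frequency table alone. The sketch assumes the reported variance uses the sample (n - 1) denominator, which is what matches the printed digits; the kurtosis computed here is the plain population excess kurtosis, so it agrees with the report only to about seven decimals.

```python
import math

# Frequency table: each month 1..12 appears the same number of times.
counts = {month: 665_045 for month in range(1, 13)}

n = sum(counts.values())
mu = sum(m * c for m, c in counts.items()) / n

# Population central moments, weighted by the counts.
m2 = sum(c * (m - mu) ** 2 for m, c in counts.items()) / n
m4 = sum(c * (m - mu) ** 4 for m, c in counts.items()) / n

sample_var = m2 * n / (n - 1)        # assumed (n - 1) denominator
excess_kurtosis = m4 / m2 ** 2 - 3   # population excess kurtosis

print(n, mu, sample_var, excess_kurtosis)
```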

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 2535801
Missing (%): 9.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.4520526
Coefficient of variation (CV): 0.5310850154
Kurtosis: -1.21678322
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 159184584
Variance: 11.91666715
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 2040828
7.6%
3 2040828
7.6%
4 2040828
7.6%
5 2040828
7.6%
6 2040828
7.6%
7 2040828
7.6%
8 2040828
7.6%
9 2040828
7.6%
10 2040828
7.6%
11 2040828
7.6%
Other values (2) 4081656
15.1%
(Missing) 2535801
9.4%
ValueCountFrequency (%)
1 2040828
7.6%
2 2040828
7.6%
3 2040828
7.6%
4 2040828
7.6%
5 2040828
7.6%
ValueCountFrequency (%)
12 2040828
7.6%
11 2040828
7.6%
10 2040828
7.6%
9 2040828
7.6%
8 2040828
7.6%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 194945
Distinct (%): 4.2%
Missing: 22386387
Missing (%): 82.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1466.269079
Minimum: 0
Maximum: 24141158
Zeros: 4368450
Zeros (%): 16.2%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 586.5822
Maximum: 24141158
Range: 24141158
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 50337.00649
Coefficient of variation (CV): 34.3299925
Kurtosis: 39472.96971
Mean: 1466.269079
Median Absolute Deviation (MAD): 0
Skewness: 162.4262987
Sum: 6802535454
Variance: 2533814223
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 4368450
 
16.2%
1000 336
 
< 0.1%
10 326
 
< 0.1%
400 316
 
< 0.1%
2000 282
 
< 0.1%
14 236
 
< 0.1%
2 216
 
< 0.1%
0.8 203
 
< 0.1%
0.2 193
 
< 0.1%
7000 186
 
< 0.1%
Other values (194935) 268606
 
1.0%
(Missing) 22386387
82.8%
ValueCountFrequency (%)
0 4368450
16.2%
0.002 19
 
< 0.1%
0.004 9
 
< 0.1%
0.006 11
 
< 0.1%
0.008 12
 
< 0.1%
ValueCountFrequency (%)
24141158 1
 
< 0.1%
18772666 1
 
< 0.1%
14662000 2
 
< 0.1%
11790203 8
< 0.1%
11124970 8
< 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 792020
Distinct (%): 6.8%
Missing: 15447641
Missing (%): 57.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 4478.004494
Minimum: 0
Maximum: 229478910
Zeros: 9605082
Zeros (%): 35.5%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 15538.23625
Maximum: 229478910
Range: 229478910
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 255161.9448
Coefficient of variation (CV): 56.98117211
Kurtosis: 336355.6504
Mean: 4478.004494
Median Absolute Deviation (MAD): 0
Skewness: 517.8094681
Sum: 5.184676592 × 10^10
Variance: 6.510761806 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 9605082
35.5%
0.2 6360
 
< 0.1%
1000 3884
 
< 0.1%
0.4 2783
 
< 0.1%
2000 2513
 
< 0.1%
3000 2072
 
< 0.1%
0.8 1880
 
< 0.1%
2 1789
 
< 0.1%
1.6 1686
 
< 0.1%
0.6 1684
 
< 0.1%
Other values (792010) 1948363
 
7.2%
(Missing) 15447641
57.2%
ValueCountFrequency (%)
0 9605082
35.5%
0.002 151
 
< 0.1%
0.004 89
 
< 0.1%
0.006 79
 
< 0.1%
0.008 80
 
< 0.1%
ValueCountFrequency (%)
229478910 1
< 0.1%
226099790 1
< 0.1%
216087540 1
< 0.1%
205762380 1
< 0.1%
153404140 1
< 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct: 8
Distinct (%): < 0.1%
Missing: 19045197
Missing (%): 70.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.437557
Minimum: 2012
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 2012
5-th percentile: 2017
Q1: 2018
median: 2019
Q3: 2019
95-th percentile: 2019
Maximum: 2020
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.803426591
Coefficient of variation (CV): 0.0003980438176
Kurtosis: -0.6182478225
Mean: 2018.437557
Median Absolute Deviation (MAD): 0
Skewness: -0.4474831864
Sum: 1.610822166 × 10^10
Variance: 0.6454942872
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
ValueCountFrequency (%)
2019 4038955
 
14.9%
2018 2348867
 
8.7%
2017 1243708
 
4.6%
2020 348700
 
1.3%
2016 286
 
< 0.1%
2013 12
 
< 0.1%
2012 11
 
< 0.1%
2014 1
 
< 0.1%
(Missing) 19045197
70.5%
ValueCountFrequency (%)
2012 11
 
< 0.1%
2013 12
 
< 0.1%
2014 1
 
< 0.1%
2016 286
 
< 0.1%
2017 1243708
4.6%
ValueCountFrequency (%)
2020 348700
 
1.3%
2019 4038955
14.9%
2018 2348867
8.7%
2017 1243708
 
4.6%
2016 286
 
< 0.1%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct: 21
Distinct (%): < 0.1%
Missing: 2535801
Missing (%): 9.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.264023
Minimum: 2000
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 206.2 MiB

Quantile statistics

Minimum: 2000
5-th percentile: 2007
Q1: 2012
median: 2015
Q3: 2018
95-th percentile: 2019
Maximum: 2020
Range: 20
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.903209899
Coefficient of variation (CV): 0.001937784647
Kurtosis: -0.601635646
Mean: 2014.264023
Median Absolute Deviation (MAD): 3
Skewness: -0.6815756752
Sum: 4.9329197 × 10^10
Variance: 15.23504752
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%)
2018 3678944
13.6%
2017 3120412
11.5%
2019 2387378
8.8%
2016 2282674
8.4%
2015 2031412
7.5%
2014 1878994
7.0%
2013 1620095
6.0%
2012 1329096
 
4.9%
2011 1149118
 
4.3%
2007 969742
 
3.6%
Other values (11) 4042071
15.0%
(Missing) 2535801
9.4%
ValueCountFrequency (%)
2000 11
 
< 0.1%
2001 122
 
< 0.1%
2002 495
 
< 0.1%
2003 1628
 
< 0.1%
2004 56563
0.2%
ValueCountFrequency (%)
2020 188830
 
0.7%
2019 2387378
8.8%
2018 3678944
13.6%
2017 3120412
11.5%
2016 2282674
8.4%
Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 206.2 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000000037
Min length: 8

Characters and Unicode

Total characters: 216205897
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
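The length statistics of this variable pin down how many values deviate from the usual 8 characters. Since lengths are whole numbers between the reported minimum of 8 and maximum of 9, every character beyond 8 per row belongs to a 9-character value. A short check, using the observation count from the overview:

```python
import math

n_obs = 27_025_737
total_chars = 216_205_897

# Characters in excess of 8 per row; each one marks a 9-character value.
nine_char_values = total_chars - 8 * n_obs
mean_length = total_chars / n_obs

print(nine_char_values, f"{mean_length:.9f}")
```

A single 9-character value is consistent with the one unique value reported for this variable.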

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 25979261
96.1%
ab3c25cf 1028669
 
3.8%
15f04f45 9668
 
< 0.1%
be4fd70b 4719
 
< 0.1%
daf49a8a 3359
 
< 0.1%
0c42a10e 28
 
< 0.1%
1d94eac1 12
 
< 0.1%
71ddaa88 10
 
< 0.1%
652d52e3 10
 
< 0.1%
p28_48_88 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 78985808
36.5%
a 27018067
 
12.5%
b 27017368
 
12.5%
4 26006716
 
12.0%
1 25988991
 
12.0%
7 25983990
 
12.0%
c 2057378
 
1.0%
f 1056083
 
0.5%
2 1028718
 
0.5%
3 1028679
 
0.5%
Other values (8) 34099
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 159044109
73.6%
Lowercase Letter 57161785
 
26.4%
Connector Punctuation 2
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 78985808
49.7%
4 26006716
 
16.4%
1 25988991
 
16.3%
7 25983990
 
16.3%
2 1028718
 
0.6%
3 1028679
 
0.6%
0 14443
 
< 0.1%
8 3383
 
< 0.1%
9 3371
 
< 0.1%
6 10
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 27018067
47.3%
b 27017368
47.3%
c 2057378
 
3.6%
f 1056083
 
1.8%
d 8120
 
< 0.1%
e 4769
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 159044111
73.6%
Latin 57161786
 
26.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 78985808
49.7%
4 26006716
 
16.4%
1 25988991
 
16.3%
7 25983990
 
16.3%
2 1028718
 
0.6%
3 1028679
 
0.6%
0 14443
 
< 0.1%
8 3383
 
< 0.1%
9 3371
 
< 0.1%
6 10
 
< 0.1%
Latin
ValueCountFrequency (%)
a 27018067
47.3%
b 27017368
47.3%
c 2057378
 
3.6%
f 1056083
 
1.8%
d 8120
 
< 0.1%
e 4769
 
< 0.1%
P 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 216205897
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 78985808
36.5%
a 27018067
 
12.5%
b 27017368
 
12.5%
4 26006716
 
12.0%
1 25988991
 
12.0%
7 25983990
 
12.0%
c 2057378
 
1.0%
f 1056083
 
0.5%
2 1028718
 
0.5%
3 1028679
 
0.5%
Other values (8) 34099
 
< 0.1%
Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 206.2 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000017021
Min length: 8

Characters and Unicode

Total characters: 216206356
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 26674689
98.7%
ab3c25cf 342439
 
1.3%
be4fd70b 3356
 
< 0.1%
15f04f45 2442
 
< 0.1%
daf49a8a 2348
 
< 0.1%
p28_48_88 460
 
< 0.1%
652d52e3 1
 
< 0.1%
71ddaa88 1
 
< 0.1%
0c42a10e 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 80371392
37.2%
a 27024175
 
12.5%
b 27023840
 
12.5%
4 26685738
 
12.3%
7 26678046
 
12.3%
1 26677133
 
12.3%
c 684879
 
0.3%
f 353027
 
0.2%
2 342902
 
0.2%
3 342440
 
0.2%
Other values (8) 22784
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 161109990
74.5%
Lowercase Letter 55094986
 
25.5%
Connector Punctuation 920
 
< 0.1%
Uppercase Letter 460
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 80371392
49.9%
4 26685738
 
16.6%
7 26678046
 
16.6%
1 26677133
 
16.6%
2 342902
 
0.2%
3 342440
 
0.2%
0 5800
 
< 0.1%
8 4190
 
< 0.1%
9 2348
 
< 0.1%
6 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 27024175
49.1%
b 27023840
49.0%
c 684879
 
1.2%
f 353027
 
0.6%
d 5707
 
< 0.1%
e 3358
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 920
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 460
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 161110910
74.5%
Latin 55095446
 
25.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 80371392
49.9%
4 26685738
 
16.6%
7 26678046
 
16.6%
1 26677133
 
16.6%
2 342902
 
0.2%
3 342440
 
0.2%
0 5800
 
< 0.1%
8 4190
 
< 0.1%
9 2348
 
< 0.1%
_ 920
 
< 0.1%
Latin
ValueCountFrequency (%)
a 27024175
49.0%
b 27023840
49.0%
c 684879
 
1.2%
f 353027
 
0.6%
d 5707
 
< 0.1%
e 3358
 
< 0.1%
P 460
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 216206356
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 80371392
37.2%
a 27024175
 
12.5%
b 27023840
 
12.5%
4 26685738
 
12.3%
7 26678046
 
12.3%
1 26677133
 
12.3%
c 684879
 
0.3%
f 353027
 
0.2%
2 342902
 
0.2%
3 342440
 
0.2%
Other values (8) 22784
 
< 0.1%