Overview

Dataset statistics

Number of variables: 19
Number of observations: 17893536
Missing cells: 114892314
Missing cells (%): 33.8%
Total size in memory: 2.5 GiB
Average record size in memory: 152.0 B
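
The overview figures can be cross-checked against each other. The sketch below assumes every column is held as 8 bytes per row (an assumption that happens to match the 136.5 MiB per-variable memory size reported further down); the constant names are illustrative and not part of the report.

```python
# Cross-check of the dataset statistics, using only numbers from the overview.
N_VARIABLES = 19
N_OBSERVATIONS = 17_893_536
MISSING_CELLS = 114_892_314
BYTES_PER_VALUE = 8  # assumption: 64-bit value or pointer per cell

# Share of missing cells over the full variables x observations grid.
total_cells = N_VARIABLES * N_OBSERVATIONS
missing_pct = 100 * MISSING_CELLS / total_cells

# Memory per column, for the whole table, and per record.
column_bytes = N_OBSERVATIONS * BYTES_PER_VALUE
column_mib = column_bytes / 2**20
total_gib = N_VARIABLES * column_bytes / 2**30
record_bytes = N_VARIABLES * BYTES_PER_VALUE

print(f"missing cells: {missing_pct:.1f}%")
print(f"per column: {column_mib:.1f} MiB, total: {total_gib:.1f} GiB")
print(f"per record: {record_bytes} B")
```

Under that assumption the computed values round to the 33.8%, 136.5 MiB, 2.5 GiB and 152 B shown in the report.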

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 17591919 (98.3%) missing values [Missing]
collater_valueofguarantee_876L has 17306035 (96.7%) missing values [Missing]
pmts_dpd_1073P has 14018715 (78.3%) missing values [Missing]
pmts_dpd_303P has 11481091 (64.2%) missing values [Missing]
pmts_month_158T has 10312896 (57.6%) missing values [Missing]
pmts_month_706T has 4192740 (23.4%) missing values [Missing]
pmts_overdue_1140A has 14008893 (78.3%) missing values [Missing]
pmts_overdue_1152A has 11474389 (64.1%) missing values [Missing]
pmts_year_1139T has 10312896 (57.6%) missing values [Missing]
pmts_year_507T has 4192740 (23.4%) missing values [Missing]
collater_valueofguarantee_1124L is highly skewed (γ1 = 93.02870407) [Skewed]
collater_valueofguarantee_876L is highly skewed (γ1 = 27.59180471) [Skewed]
pmts_dpd_303P is highly skewed (γ1 = 50.67888568) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 206.234823) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 224.8312526) [Skewed]
collater_valueofguarantee_1124L has 280131 (1.6%) zeros [Zeros]
collater_valueofguarantee_876L has 516710 (2.9%) zeros [Zeros]
num_group1 has 4888627 (27.3%) zeros [Zeros]
num_group2 has 705366 (3.9%) zeros [Zeros]
pmts_dpd_1073P has 3646455 (20.4%) zeros [Zeros]
pmts_dpd_303P has 5378239 (30.1%) zeros [Zeros]
pmts_overdue_1140A has 3652565 (20.4%) zeros [Zeros]
pmts_overdue_1152A has 5336097 (29.8%) zeros [Zeros]
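
As a rough illustration of how alerts like these can be derived from per-column summaries, here is a minimal sketch. The three thresholds and the `ColumnSummary` and `alerts_for` names are assumptions made for this example: they are consistent with the alerts listed above (for instance, pmts_dpd_1073P with γ1 ≈ 16.9 is not flagged as skewed), but the values actually used live in config.json, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed thresholds (hypothetical, not read from the report's config.json).
MISSING_THRESHOLD = 0.01    # flag when more than 1% of rows are missing
ZEROS_THRESHOLD = 0.01      # flag when more than 1% of rows are zero
SKEWNESS_THRESHOLD = 20.0   # flag when |skewness| exceeds 20


@dataclass
class ColumnSummary:
    name: str
    p_missing: float            # fraction of rows that are missing
    p_zeros: float              # fraction of rows equal to zero
    skewness: Optional[float]   # None for non-numeric columns


def alerts_for(col: ColumnSummary) -> List[str]:
    """Return the alert tags a column would receive under the assumed thresholds."""
    alerts = []
    if col.p_missing > MISSING_THRESHOLD:
        alerts.append("Missing")
    if col.skewness is not None and abs(col.skewness) > SKEWNESS_THRESHOLD:
        alerts.append("Skewed")
    if col.p_zeros > ZEROS_THRESHOLD:
        alerts.append("Zeros")
    return alerts


# Fractions and skewness values copied from the report.
columns = [
    ColumnSummary("pmts_dpd_1073P", 0.783, 0.204, 16.91171771),
    ColumnSummary("pmts_dpd_303P", 0.642, 0.301, 50.67888568),
    ColumnSummary("num_group1", 0.0, 0.273, 9.524976471),
    ColumnSummary("pmts_month_158T", 0.576, 0.0, 0.0),
]
flagged = {c.name: alerts_for(c) for c in columns}
for name, alerts in flagged.items():
    print(name, alerts)
```

For these four columns the sketch reproduces the tags shown in the report; it is not a description of the profiler's internal logic.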

Reproduction

Analysis started: 2024-02-13 19:44:47.246727
Analysis finished: 2024-02-13 19:45:28.041616
Duration: 40.79 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
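
The reported duration follows directly from the two timestamps. A small check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-02-13 19:44:47.246727")
finished = datetime.fromisoformat("2024-02-13 19:45:28.041616")

# Elapsed wall-clock time of the profiling run, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```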

Variables

case_id
Real number (ℝ)

Distinct: 156749
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1323076.851
Minimum: 13927
Maximum: 2593511
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 13927
5-th percentile: 133731
Q1: 727795
median: 1399594
Q3: 1427579
95-th percentile: 2588352
Maximum: 2593511
Range: 2579584
Interquartile range (IQR): 699784

Descriptive statistics

Standard deviation: 702400.9226
Coefficient of variation (CV): 0.5308844473
Kurtosis: -0.1877919764
Mean: 1323076.851
Median Absolute Deviation (MAD): 33577
Skewness: 0.1888454156
Sum: 2.367452326 × 10^13
Variance: 4.933670561 × 10^11
Monotonicity: Increasing
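
Several of the case_id statistics are derived from others, which makes them easy to sanity-check. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the reported values; the inputs are rounded as printed in the report, so the results agree only up to rounding.

```python
import math

# Values copied from the case_id statistics.
n = 17_893_536
minimum, q1, q3, maximum = 13_927, 727_795, 1_427_579, 2_593_511
mean, std = 1_323_076.851, 702_400.9226

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared
total = mean * n                  # Sum = mean * number of (non-missing) rows

print(value_range, iqr)
print(f"CV = {cv:.4f}")
print(f"variance = {variance:.4e}, sum = {total:.4e}")
```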
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1383556 4209
 
< 0.1%
1396592 3336
 
< 0.1%
140807 1752
 
< 0.1%
1404470 1704
 
< 0.1%
1425225 1680
 
< 0.1%
1390546 1524
 
< 0.1%
1424070 1488
 
< 0.1%
1395337 1440
 
< 0.1%
141012 1404
 
< 0.1%
1420355 1356
 
< 0.1%
Other values (156739) 17873643
99.9%
ValueCountFrequency (%)
13927 36
 
< 0.1%
13994 24
 
< 0.1%
14050 96
< 0.1%
14051 36
 
< 0.1%
14053 36
 
< 0.1%
ValueCountFrequency (%)
2593511 132
< 0.1%
2593508 108
< 0.1%
2593507 24
 
< 0.1%
2593505 240
< 0.1%
2593504 120
< 0.1%
Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 143148288
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8fd95e4b
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 17591919
98.3%
9a0c095e 212129
 
1.2%
8fd95e4b 89356
 
0.5%
06fb9ba8 132
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 53077242
37.1%
a 17804180
 
12.4%
b 17681539
 
12.4%
4 17681275
 
12.4%
7 17591919
 
12.3%
1 17591919
 
12.3%
9 513746
 
0.4%
0 424390
 
0.3%
e 301485
 
0.2%
c 212129
 
0.1%
Other values (4) 268464
 
0.2%
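
The character table above can be rebuilt from the value counts of this variable: each value contributes its characters once per occurrence. A minimal sketch using the four values and counts listed for this variable (the variable names are illustrative):

```python
from collections import Counter

# Value counts copied from the frequency table of this text variable.
value_counts = {
    "a55475b1": 17_591_919,
    "9a0c095e": 212_129,
    "8fd95e4b": 89_356,
    "06fb9ba8": 132,
}

# Weight every character of a value by the number of rows holding that value.
char_counts = Counter()
for value, n_rows in value_counts.items():
    for ch in value:
        char_counts[ch] += n_rows

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common(4):
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```

The totals agree with the report: 143148288 characters overall, 14 distinct characters, and 53077242 occurrences of "5".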

Most occurring categories

ValueCountFrequency (%)
Decimal Number 106970111
74.7%
Lowercase Letter 36178177
 
25.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 53077242
49.6%
4 17681275
 
16.5%
7 17591919
 
16.4%
1 17591919
 
16.4%
9 513746
 
0.5%
0 424390
 
0.4%
8 89488
 
0.1%
6 132
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 17804180
49.2%
b 17681539
48.9%
e 301485
 
0.8%
c 212129
 
0.6%
f 89488
 
0.2%
d 89356
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 106970111
74.7%
Latin 36178177
 
25.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 53077242
49.6%
4 17681275
 
16.5%
7 17591919
 
16.4%
1 17591919
 
16.4%
9 513746
 
0.5%
0 424390
 
0.4%
8 89488
 
0.1%
6 132
 
< 0.1%
Latin
ValueCountFrequency (%)
a 17804180
49.2%
b 17681539
48.9%
e 301485
 
0.8%
c 212129
 
0.6%
f 89488
 
0.2%
d 89356
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 143148288
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 53077242
37.1%
a 17804180
 
12.4%
b 17681539
 
12.4%
4 17681275
 
12.4%
7 17591919
 
12.3%
1 17591919
 
12.3%
9 513746
 
0.4%
0 424390
 
0.3%
e 301485
 
0.2%
c 212129
 
0.1%
Other values (4) 268464
 
0.2%
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 143148288
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 17306035
96.7%
9a0c095e 331844
 
1.9%
8fd95e4b 254546
 
1.4%
06fb9ba8 963
 
< 0.1%
3cbe86ba 145
 
< 0.1%
c7a5ad39 2
 
< 0.1%
f4d8a027 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 52504497
36.7%
a 17638992
 
12.3%
b 17562797
 
12.3%
4 17560582
 
12.3%
7 17306038
 
12.1%
1 17306035
 
12.1%
9 919199
 
0.6%
0 664652
 
0.5%
e 586535
 
0.4%
c 331991
 
0.2%
Other values (6) 766970
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 106517914
74.4%
Lowercase Letter 36630374
 
25.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 52504497
49.3%
4 17560582
 
16.5%
7 17306038
 
16.2%
1 17306035
 
16.2%
9 919199
 
0.9%
0 664652
 
0.6%
8 255655
 
0.2%
6 1108
 
< 0.1%
3 147
 
< 0.1%
2 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 17638992
48.2%
b 17562797
47.9%
e 586535
 
1.6%
c 331991
 
0.9%
f 255510
 
0.7%
d 254549
 
0.7%

Most occurring scripts

ValueCountFrequency (%)
Common 106517914
74.4%
Latin 36630374
 
25.6%

Most frequent character per script

Common
ValueCountFrequency (%)
5 52504497
49.3%
4 17560582
 
16.5%
7 17306038
 
16.2%
1 17306035
 
16.2%
9 919199
 
0.9%
0 664652
 
0.6%
8 255655
 
0.2%
6 1108
 
< 0.1%
3 147
 
< 0.1%
2 1
 
< 0.1%
Latin
ValueCountFrequency (%)
a 17638992
48.2%
b 17562797
47.9%
e 586535
 
1.6%
c 331991
 
0.9%
f 255510
 
0.7%
d 254549
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 143148288
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 52504497
36.7%
a 17638992
 
12.3%
b 17562797
 
12.3%
4 17560582
 
12.3%
7 17306038
 
12.1%
1 17306035
 
12.1%
9 919199
 
0.6%
0 664652
 
0.5%
e 586535
 
0.4%
c 331991
 
0.2%
Other values (6) 766970
 
0.5%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 14224
Distinct (%): 4.7%
Missing: 17591919
Missing (%): 98.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 1129784.99
Minimum: 0
Maximum: 3200000000
Zeros: 280131
Zeros (%): 1.6%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3500000
Maximum: 3200000000
Range: 3200000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 24676664.46
Coefficient of variation (CV): 21.84191212
Kurtosis: 10961.52147
Mean: 1129784.99
Median Absolute Deviation (MAD): 0
Skewness: 93.02870407
Sum: 3.407623592 × 10^11
Variance: 6.089377687 × 10^14
Monotonicity: Not monotonic
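
For a mostly missing column such as this one, it helps to see which statistics are taken over all rows and which over the non-missing rows only. The sketch below is an inference from the reported numbers, not a statement about the profiler's implementation: the 4.7% distinct share and the mean only reproduce when divided by the non-missing count, while the 1.6% zeros share is relative to all rows.

```python
import math

# Values copied from the collater_valueofguarantee_1124L statistics.
n_obs = 17_893_536
n_missing = 17_591_919
n_distinct = 14_224
n_zeros = 280_131
reported_sum = 3.407623592e11

n_present = n_obs - n_missing                    # rows with an actual value
distinct_pct = 100 * n_distinct / n_present      # relative to non-missing rows
zeros_pct_all = 100 * n_zeros / n_obs            # relative to all rows
zeros_pct_present = 100 * n_zeros / n_present    # relative to non-missing rows
mean = reported_sum / n_present

print(n_present, f"{distinct_pct:.1f}%", f"{zeros_pct_all:.1f}%")
print(f"zeros among present values: {zeros_pct_present:.1f}%")
print(f"mean = {mean:.2f}")
```

Roughly 93% of the present values are zero, which is why every quantile up to Q3 is 0 while the mean is above one million.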
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 280131
 
1.6%
3000000 151
 
< 0.1%
5000000 136
 
< 0.1%
4000000 124
 
< 0.1%
2000000 121
 
< 0.1%
1 115
 
< 0.1%
10000000 108
 
< 0.1%
2500000 85
 
< 0.1%
3500000 79
 
< 0.1%
6000000 73
 
< 0.1%
Other values (14214) 20494
 
0.1%
(Missing) 17591919
98.3%
ValueCountFrequency (%)
0 280131
1.6%
0.02 1
 
< 0.1%
1 115
 
< 0.1%
5 1
 
< 0.1%
500 1
 
< 0.1%
ValueCountFrequency (%)
3200000000 11
< 0.1%
1912618083 3
 
< 0.1%
1325758000 1
 
< 0.1%
1267140000 1
 
< 0.1%
1000000000 1
 
< 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 24044
Distinct (%): 4.1%
Missing: 17306035
Missing (%): 96.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 3840788.337
Minimum: 0
Maximum: 4905062000
Zeros: 516710
Zeros (%): 2.9%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 228000
Maximum: 4905062000
Range: 4905062000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 69028116.68
Coefficient of variation (CV): 17.97238239
Kurtosis: 963.9886887
Mean: 3840788.337
Median Absolute Deviation (MAD): 0
Skewness: 27.59180471
Sum: 2.256466989 × 10^12
Variance: 4.764880892 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 516710
 
2.9%
60000 1829
 
< 0.1%
130000 1443
 
< 0.1%
100000 1349
 
< 0.1%
50000 1035
 
< 0.1%
65000 888
 
< 0.1%
70000 656
 
< 0.1%
80000 617
 
< 0.1%
150000 580
 
< 0.1%
200000 576
 
< 0.1%
Other values (24034) 61818
 
0.3%
(Missing) 17306035
96.7%
ValueCountFrequency (%)
0 516710
2.9%
0.02 10
 
< 0.1%
0.03 12
 
< 0.1%
0.04 4
 
< 0.1%
0.06 4
 
< 0.1%
ValueCountFrequency (%)
4905062000 1
 
< 0.1%
3250000000 60
 
< 0.1%
3200000000 14
 
< 0.1%
2947009633 1
 
< 0.1%
2000000000 175
< 0.1%
Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 143148288
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 17306035
96.7%
c7a5ad39 433700
 
2.4%
3cbe86ba 108817
 
0.6%
9276e4bb 16582
 
0.1%
0e63c0f0 8085
 
< 0.1%
168ad9f3 4694
 
< 0.1%
5224034a 3499
 
< 0.1%
7b62420e 3401
 
< 0.1%
940efad7 3178
 
< 0.1%
5994c34a 1619
 
< 0.1%
Other values (5) 3926
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 52357543
36.6%
a 18297596
 
12.8%
7 17765339
 
12.4%
b 17561469
 
12.3%
4 17341213
 
12.1%
1 17314399
 
12.1%
3 560414
 
0.4%
c 554366
 
0.4%
9 461392
 
0.3%
d 444263
 
0.3%
Other values (6) 490294
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 106129696
74.1%
Lowercase Letter 37018592
 
25.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 52357543
49.3%
7 17765339
 
16.7%
4 17341213
 
16.3%
1 17314399
 
16.3%
3 560414
 
0.5%
9 461392
 
0.4%
6 142814
 
0.1%
8 115297
 
0.1%
0 36687
 
< 0.1%
2 34598
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 18297596
49.4%
b 17561469
47.4%
c 554366
 
1.5%
d 444263
 
1.2%
e 140683
 
0.4%
f 20215
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 106129696
74.1%
Latin 37018592
 
25.9%

Most frequent character per script

Common
ValueCountFrequency (%)
5 52357543
49.3%
7 17765339
 
16.7%
4 17341213
 
16.3%
1 17314399
 
16.3%
3 560414
 
0.5%
9 461392
 
0.4%
6 142814
 
0.1%
8 115297
 
0.1%
0 36687
 
< 0.1%
2 34598
 
< 0.1%
Latin
ValueCountFrequency (%)
a 18297596
49.4%
b 17561469
47.4%
c 554366
 
1.5%
d 444263
 
1.2%
e 140683
 
0.4%
f 20215
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 143148288
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 52357543
36.6%
a 18297596
 
12.8%
7 17765339
 
12.4%
b 17561469
 
12.3%
4 17341213
 
12.1%
1 17314399
 
12.1%
3 560414
 
0.4%
c 554366
 
0.4%
9 461392
 
0.3%
d 444263
 
0.3%
Other values (6) 490294
 
0.3%
Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 143148288
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c7a5ad39
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 17591919
98.3%
c7a5ad39 273548
 
1.5%
9276e4bb 11440
 
0.1%
0e63c0f0 7695
 
< 0.1%
7b62420e 3074
 
< 0.1%
168ad9f3 3057
 
< 0.1%
940efad7 853
 
< 0.1%
3cbe86ba 510
 
< 0.1%
f4d8a027 466
 
< 0.1%
46ab00a7 333
 
< 0.1%
Other values (5) 641
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 53049623
37.1%
a 18144816
 
12.7%
7 17881717
 
12.5%
b 17619305
 
12.3%
4 17608578
 
12.3%
1 17595686
 
12.3%
9 288976
 
0.2%
3 285054
 
0.2%
c 282184
 
0.2%
d 278242
 
0.2%
Other values (6) 114107
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 106787383
74.6%
Lowercase Letter 36360905
 
25.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 53049623
49.7%
7 17881717
 
16.7%
4 17608578
 
16.5%
1 17595686
 
16.5%
9 288976
 
0.3%
3 285054
 
0.3%
0 28354
 
< 0.1%
6 26188
 
< 0.1%
2 19100
 
< 0.1%
8 4107
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 18144816
49.9%
b 17619305
48.5%
c 282184
 
0.8%
d 278242
 
0.8%
e 23646
 
0.1%
f 12712
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 106787383
74.6%
Latin 36360905
 
25.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 53049623
49.7%
7 17881717
 
16.7%
4 17608578
 
16.5%
1 17595686
 
16.5%
9 288976
 
0.3%
3 285054
 
0.3%
0 28354
 
< 0.1%
6 26188
 
< 0.1%
2 19100
 
< 0.1%
8 4107
 
< 0.1%
Latin
ValueCountFrequency (%)
a 18144816
49.9%
b 17619305
48.5%
c 282184
 
0.8%
d 278242
 
0.8%
e 23646
 
0.1%
f 12712
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 143148288
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 53049623
37.1%
a 18144816
 
12.7%
7 17881717
 
12.5%
b 17619305
 
12.3%
4 17608578
 
12.3%
1 17595686
 
12.3%
9 288976
 
0.2%
3 285054
 
0.2%
c 282184
 
0.2%
d 278242
 
0.2%
Other values (6) 114107
 
0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 243
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.748604859
Minimum: 0
Maximum: 242
Zeros: 4888627
Zeros (%): 27.3%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 5
95-th percentile: 13
Maximum: 242
Range: 242
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.886697029
Coefficient of variation (CV): 1.570370111
Kurtosis: 240.7655965
Mean: 3.748604859
Median Absolute Deviation (MAD): 2
Skewness: 9.524976471
Sum: 67075796
Variance: 34.65320191
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 4888627
27.3%
1 3118173
17.4%
2 2077950
11.6%
3 1509896
 
8.4%
4 1171802
 
6.5%
5 948040
 
5.3%
6 772672
 
4.3%
7 629147
 
3.5%
8 511118
 
2.9%
9 414183
 
2.3%
Other values (233) 1851928
 
10.3%
ValueCountFrequency (%)
0 4888627
27.3%
1 3118173
17.4%
2 2077950
11.6%
3 1509896
 
8.4%
4 1171802
 
6.5%
ValueCountFrequency (%)
242 36
< 0.1%
241 24
< 0.1%
240 12
 
< 0.1%
239 24
< 0.1%
238 24
< 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct: 36
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.75626578
Minimum: 0
Maximum: 35
Zeros: 705366
Zeros (%): 3.9%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
median: 12
Q3: 21
95-th percentile: 32
Maximum: 35
Range: 35
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 9.444509069
Coefficient of variation (CV): 0.6865605259
Kurtosis: -0.7450781045
Mean: 13.75626578
Median Absolute Deviation (MAD): 7
Skewness: 0.4526629906
Sum: 246148237
Variance: 89.19875156
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
ValueCountFrequency (%)
0 705366
 
3.9%
1 705282
 
3.9%
2 705279
 
3.9%
3 705278
 
3.9%
4 705278
 
3.9%
5 705277
 
3.9%
6 705277
 
3.9%
7 705277
 
3.9%
8 705277
 
3.9%
9 705277
 
3.9%
Other values (26) 10840668
60.6%
ValueCountFrequency (%)
0 705366
3.9%
1 705282
3.9%
2 705279
3.9%
3 705278
3.9%
4 705278
3.9%
ValueCountFrequency (%)
35 240097
1.3%
34 240097
1.3%
33 240097
1.3%
32 240097
1.3%
31 240097
1.3%

pmts_dpd_1073P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3636
Distinct (%): 0.1%
Missing: 14018715
Missing (%): 78.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.33832944
Minimum: 0
Maximum: 4455
Zeros: 3646455
Zeros (%): 20.4%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 4455
Range: 4455
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 129.55736
Coefficient of variation (CV): 11.4264946
Kurtosis: 350.6340232
Mean: 11.33832944
Median Absolute Deviation (MAD): 0
Skewness: 16.91171771
Sum: 43933997
Variance: 16785.10954
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3646455
 
20.4%
1 35584
 
0.2%
3 13830
 
0.1%
2 13671
 
0.1%
4 13102
 
0.1%
7 8038
 
< 0.1%
5 7333
 
< 0.1%
6 7257
 
< 0.1%
10 5482
 
< 0.1%
8 5470
 
< 0.1%
Other values (3626) 118599
 
0.7%
(Missing) 14018715
78.3%
ValueCountFrequency (%)
0 3646455
20.4%
1 35584
 
0.2%
2 13671
 
0.1%
3 13830
 
0.1%
4 13102
 
0.1%
ValueCountFrequency (%)
4455 1
< 0.1%
4445 1
< 0.1%
4423 1
< 0.1%
4391 1
< 0.1%
4365 1
< 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 3939
Distinct (%): 0.1%
Missing: 11481091
Missing (%): 64.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.53682815
Minimum: -11
Maximum: 117000
Zeros: 5378239
Zeros (%): 30.1%
Negative: 1070
Negative (%): < 0.1%
Memory size: 136.5 MiB

Quantile statistics

Minimum: -11
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 316
Maximum: 117000
Range: 117011
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 271.1978107
Coefficient of variation (CV): 5.162051465
Kurtosis: 15096.84724
Mean: 52.53682815
Median Absolute Deviation (MAD): 0
Skewness: 50.67888568
Sum: 336889521
Variance: 73548.25254
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 5378239
30.1%
1 155702
 
0.9%
3 39667
 
0.2%
2 37077
 
0.2%
4 32617
 
0.2%
6 27457
 
0.2%
5 21786
 
0.1%
7 21024
 
0.1%
9 14802
 
0.1%
8 14588
 
0.1%
Other values (3929) 669486
 
3.7%
(Missing) 11481091
64.2%
ValueCountFrequency (%)
-11 1
 
< 0.1%
-10 10
< 0.1%
-9 12
< 0.1%
-8 6
< 0.1%
-7 3
 
< 0.1%
ValueCountFrequency (%)
117000 1
< 0.1%
84575 1
< 0.1%
84560 2
< 0.1%
84533 2
< 0.1%
84505 1
< 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 10312896
Missing (%): 57.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052757
Coefficient of variation (CV): 0.5310850396
Kurtosis: -1.216783228
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 49274160
Variance: 11.91666824
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 631720
 
3.5%
3 631720
 
3.5%
4 631720
 
3.5%
5 631720
 
3.5%
6 631720
 
3.5%
7 631720
 
3.5%
8 631720
 
3.5%
9 631720
 
3.5%
10 631720
 
3.5%
11 631720
 
3.5%
Other values (2) 1263440
 
7.1%
(Missing) 10312896
57.6%
ValueCountFrequency (%)
1 631720
3.5%
2 631720
3.5%
3 631720
3.5%
4 631720
3.5%
5 631720
3.5%
ValueCountFrequency (%)
12 631720
3.5%
11 631720
3.5%
10 631720
3.5%
9 631720
3.5%
8 631720
3.5%
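
Every month from 1 to 12 occurs exactly 631720 times in pmts_month_158T, so its statistics are those of a discrete uniform distribution on 1..12. The sketch below recomputes them from the closed-form expressions. It assumes the report prints a sample variance (divisor n - 1) and a bias-corrected kurtosis, so the population kurtosis computed here matches the printed one only to about seven decimal places.

```python
from fractions import Fraction

K = 12                      # number of equally frequent values (months 1..12)
COUNT_PER_MONTH = 631_720   # rows per month, from the frequency table
n = K * COUNT_PER_MONTH     # non-missing rows

mean = Fraction(K + 1, 2)                              # (K + 1) / 2
pop_variance = Fraction(K**2 - 1, 12)                  # (K^2 - 1) / 12
sample_variance = pop_variance * Fraction(n, n - 1)    # rescaled to divisor n - 1
excess_kurtosis = Fraction(-6, 5) * Fraction(K**2 + 1, K**2 - 1)
total = mean * n

print(float(mean), int(total))
print(f"sample variance  = {float(sample_variance):.8f}")
print(f"excess kurtosis  = {float(excess_kurtosis):.9f}")
```

The same reasoning applies to pmts_month_706T, where each month occurs 1141733 times.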

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 4192740
Missing (%): 23.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052656
Coefficient of variation (CV): 0.5310850239
Kurtosis: -1.216783223
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 89055174
Variance: 11.91666754
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 1141733
 
6.4%
3 1141733
 
6.4%
4 1141733
 
6.4%
5 1141733
 
6.4%
6 1141733
 
6.4%
7 1141733
 
6.4%
8 1141733
 
6.4%
9 1141733
 
6.4%
10 1141733
 
6.4%
11 1141733
 
6.4%
Other values (2) 2283466
12.8%
(Missing) 4192740
23.4%
ValueCountFrequency (%)
1 1141733
6.4%
2 1141733
6.4%
3 1141733
6.4%
4 1141733
6.4%
5 1141733
6.4%
ValueCountFrequency (%)
12 1141733
6.4%
11 1141733
6.4%
10 1141733
6.4%
9 1141733
6.4%
8 1141733
6.4%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 168579
Distinct (%): 4.3%
Missing: 14008893
Missing (%): 78.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 1683.38384
Minimum: 0
Maximum: 23891848
Zeros: 3652565
Zeros (%): 20.4%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 728.335
Maximum: 23891848
Range: 23891848
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 78014.54689
Coefficient of variation (CV): 46.34388488
Kurtosis: 52939.56564
Mean: 1683.38384
Median Absolute Deviation (MAD): 0
Skewness: 206.234823
Sum: 6539345250
Variance: 6086269527
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3652565
 
20.4%
1000 441
 
< 0.1%
2000 306
 
< 0.1%
400 234
 
< 0.1%
10 232
 
< 0.1%
3000 230
 
< 0.1%
0.4 194
 
< 0.1%
2 193
 
< 0.1%
4 190
 
< 0.1%
1.6 186
 
< 0.1%
Other values (168569) 229872
 
1.3%
(Missing) 14008893
78.3%
ValueCountFrequency (%)
0 3652565
20.4%
0.002 15
 
< 0.1%
0.004 7
 
< 0.1%
0.006 13
 
< 0.1%
0.008 17
 
< 0.1%
ValueCountFrequency (%)
23891848 17
< 0.1%
17402200 7
< 0.1%
15768560 17
< 0.1%
12144611 10
< 0.1%
9278913 7
< 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 470005
Distinct (%): 7.3%
Missing: 11474389
Missing (%): 64.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3933.6804
Minimum: 0
Maximum: 38038588
Zeros: 5336097
Zeros (%): 29.8%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 15315.4308
Maximum: 38038588
Range: 38038588
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 71135.53397
Coefficient of variation (CV): 18.08370959
Kurtosis: 72421.98985
Mean: 3933.6804
Median Absolute Deviation (MAD): 0
Skewness: 224.8312526
Sum: 2.525087274 × 10^10
Variance: 5060264194
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 5336097
29.8%
0.2 3645
 
< 0.1%
1000 2069
 
< 0.1%
0.4 1428
 
< 0.1%
2000 1319
 
< 0.1%
0.8 1121
 
< 0.1%
3000 1107
 
< 0.1%
1 1015
 
< 0.1%
2 1013
 
< 0.1%
1.6 1003
 
< 0.1%
Other values (469995) 1069330
 
6.0%
(Missing) 11474389
64.1%
ValueCountFrequency (%)
0 5336097
29.8%
0.002 97
 
< 0.1%
0.004 42
 
< 0.1%
0.006 25
 
< 0.1%
0.008 35
 
< 0.1%
ValueCountFrequency (%)
38038588 1
 
< 0.1%
32000000 2
 
< 0.1%
24400000 1
 
< 0.1%
24000000 7
< 0.1%
21444070 2
 
< 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): < 0.1%
Missing: 10312896
Missing (%): 57.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.374944
Minimum: 2015
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 2015
5-th percentile: 2017
Q1: 2018
median: 2018
Q3: 2019
95-th percentile: 2019
Maximum: 2020
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7944665286
Coefficient of variation (CV): 0.00039361692
Kurtosis: -0.6926642171
Mean: 2018.374944
Median Absolute Deviation (MAD): 1
Skewness: -0.3098220355
Sum: 1.530057383 × 10^10
Variance: 0.6311770651
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
2019 3463369
 
19.4%
2018 2614271
 
14.6%
2017 1208349
 
6.8%
2020 294154
 
1.6%
2016 475
 
< 0.1%
2015 22
 
< 0.1%
(Missing) 10312896
57.6%
ValueCountFrequency (%)
2015 22
 
< 0.1%
2016 475
 
< 0.1%
2017 1208349
 
6.8%
2018 2614271
14.6%
2019 3463369
19.4%
ValueCountFrequency (%)
2020 294154
 
1.6%
2019 3463369
19.4%
2018 2614271
14.6%
2017 1208349
 
6.8%
2016 475
 
< 0.1%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct: 20
Distinct (%): < 0.1%
Missing: 4192740
Missing (%): 23.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.929632
Minimum: 2001
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 136.5 MiB

Quantile statistics

Minimum: 2001
5-th percentile: 2007
Q1: 2011
median: 2015
Q3: 2017
95-th percentile: 2019
Maximum: 2020
Range: 19
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.849804474
Coefficient of variation (CV): 0.001911588376
Kurtosis: -0.670652374
Mean: 2013.929632
Median Absolute Deviation (MAD): 3
Skewness: -0.6230673825
Sum: 2.759243904 × 10^10
Variance: 14.82099449
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
ValueCountFrequency (%)
2018 1915284
10.7%
2017 1795630
10.0%
2016 1368263
 
7.6%
2015 1219915
 
6.8%
2014 1132030
 
6.3%
2013 977085
 
5.5%
2019 828225
 
4.6%
2012 803078
 
4.5%
2011 694279
 
3.9%
2007 583142
 
3.3%
Other values (10) 2383865
13.3%
(Missing) 4192740
23.4%
ValueCountFrequency (%)
2001 44
 
< 0.1%
2002 422
 
< 0.1%
2003 1314
 
< 0.1%
2004 35173
 
0.2%
2005 167593
0.9%
ValueCountFrequency (%)
2020 60727
 
0.3%
2019 828225
4.6%
2018 1915284
10.7%
2017 1795630
10.0%
2016 1368263
7.6%
Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 143148288
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 17312772
96.8%
ab3c25cf 570195
 
3.2%
15f04f45 5649
 
< 0.1%
be4fd70b 2971
 
< 0.1%
daf49a8a 1918
 
< 0.1%
71ddaa88 13
 
< 0.1%
0c42a10e 11
 
< 0.1%
1d94eac1 6
 
< 0.1%
9ba4314a 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 52519809
36.7%
b 17888910
 
12.5%
a 17888766
 
12.5%
4 17328978
 
12.1%
1 17318458
 
12.1%
7 17315756
 
12.1%
c 1140407
 
0.8%
f 586382
 
0.4%
2 570206
 
0.4%
3 570196
 
0.4%
Other values (5) 20420
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 105635914
73.8%
Lowercase Letter 37512374
 
26.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 52519809
49.7%
4 17328978
 
16.4%
1 17318458
 
16.4%
7 17315756
 
16.4%
2 570206
 
0.5%
3 570196
 
0.5%
0 8642
 
< 0.1%
8 1944
 
< 0.1%
9 1925
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 17888910
47.7%
a 17888766
47.7%
c 1140407
 
3.0%
f 586382
 
1.6%
d 4921
 
< 0.1%
e 2988
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 105635914
73.8%
Latin 37512374
 
26.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 52519809
49.7%
4 17328978
 
16.4%
1 17318458
 
16.4%
7 17315756
 
16.4%
2 570206
 
0.5%
3 570196
 
0.5%
0 8642
 
< 0.1%
8 1944
 
< 0.1%
9 1925
 
< 0.1%
Latin
ValueCountFrequency (%)
b 17888910
47.7%
a 17888766
47.7%
c 1140407
 
3.0%
f 586382
 
1.6%
d 4921
 
< 0.1%
e 2988
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 143148288
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 52519809
36.7%
b 17888910
 
12.5%
a 17888766
 
12.5%
4 17328978
 
12.1%
1 17318458
 
12.1%
7 17315756
 
12.1%
c 1140407
 
0.8%
f 586382
 
0.4%
2 570206
 
0.4%
3 570196
 
0.4%
Other values (5) 20420
 
< 0.1%
Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.5 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000028111
Min length: 8

Characters and Unicode

Total characters: 143148791
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 17597382
98.3%
ab3c25cf 288736
 
1.6%
be4fd70b 2861
 
< 0.1%
15f04f45 2055
 
< 0.1%
daf49a8a 1993
 
< 0.1%
p28_48_88 503
 
< 0.1%
71ddaa88 5
 
< 0.1%
0c42a10e 1
 
< 0.1%
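
This is the only text variable whose mean length is not exactly 8, and the frequency table explains why: one value has nine characters and occurs 503 times. A short check using the values as printed in the table (the variable names are illustrative):

```python
# Value counts copied from the frequency table of this text variable.
value_counts = {
    "a55475b1": 17_597_382,
    "ab3c25cf": 288_736,
    "be4fd70b": 2_861,
    "15f04f45": 2_055,
    "daf49a8a": 1_993,
    "p28_48_88": 503,   # the single 9-character value
    "71ddaa88": 5,
    "0c42a10e": 1,
}

n_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows
underscores = sum(value.count("_") * count for value, count in value_counts.items())

print(n_rows, total_chars, underscores)
print(f"mean length = {mean_length:.9f}")
```

The 503 extra characters account for the total of 143148791, and the two underscores per occurrence give the 1006 Connector Punctuation characters listed further down.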

Most occurring characters

ValueCountFrequency (%)
5 53084992
37.1%
a 17892108
 
12.5%
b 17891840
 
12.5%
4 17606850
 
12.3%
7 17600248
 
12.3%
1 17599443
 
12.3%
c 577473
 
0.4%
f 297700
 
0.2%
2 289240
 
0.2%
3 288736
 
0.2%
Other values (7) 20161
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 106480435
74.4%
Lowercase Letter 36666847
 
25.6%
Connector Punctuation 1006
 
< 0.1%
Uppercase Letter 503
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 53084992
49.9%
4 17606850
 
16.5%
7 17600248
 
16.5%
1 17599443
 
16.5%
2 289240
 
0.3%
3 288736
 
0.3%
0 4918
 
< 0.1%
8 4015
 
< 0.1%
9 1993
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 17892108
48.8%
b 17891840
48.8%
c 577473
 
1.6%
f 297700
 
0.8%
d 4864
 
< 0.1%
e 2862
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1006
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 503
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 106481441
74.4%
Latin 36667350
 
25.6%

Most frequent character per script

Common
ValueCountFrequency (%)
5 53084992
49.9%
4 17606850
 
16.5%
7 17600248
 
16.5%
1 17599443
 
16.5%
2 289240
 
0.3%
3 288736
 
0.3%
0 4918
 
< 0.1%
8 4015
 
< 0.1%
9 1993
 
< 0.1%
_ 1006
 
< 0.1%
Latin
ValueCountFrequency (%)
a 17892108
48.8%
b 17891840
48.8%
c 577473
 
1.6%
f 297700
 
0.8%
d 4864
 
< 0.1%
e 2862
 
< 0.1%
P 503
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 143148791
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 53084992
37.1%
a 17892108
 
12.5%
b 17891840
 
12.5%
4 17606850
 
12.3%
7 17600248
 
12.3%
1 17599443
 
12.3%
c 577473
 
0.4%
f 297700
 
0.2%
2 289240
 
0.2%
3 288736
 
0.2%
Other values (7) 20161
 
< 0.1%