Overview

Dataset statistics

Number of variables: 19
Number of observations: 7861809
Missing cells: 53702386
Missing cells (%): 36.0%
Total size in memory: 1.1 GiB
Average record size in memory: 152.0 B
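
The derived figures in this table can be cross-checked from the raw counts. The following is a minimal Python sketch, not part of the original report. It assumes that every cell occupies 8 bytes, an assumption that happens to be consistent with the reported record size and the 60.0 MiB listed per variable further down.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 7_861_809
missing_cells = 53_702_386

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption (not stated in the report): each cell is stored in 8 bytes.
BYTES_PER_CELL = 8
record_bytes = n_variables * BYTES_PER_CELL
column_mib = n_observations * BYTES_PER_CELL / 2**20
total_gib = n_observations * record_bytes / 2**30

print(f"missing cells: {missing_pct:.1f}%")
print(f"record size:   {record_bytes} B")
print(f"per column:    {column_mib:.1f} MiB")
print(f"total:         {total_gib:.1f} GiB")
```

Under the 8-byte assumption the four printed values agree with the report after rounding to one decimal place.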

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 7617072 (96.9%) missing values [Missing]
collater_valueofguarantee_876L has 7782726 (99.0%) missing values [Missing]
pmts_dpd_1073P has 4801989 (61.1%) missing values [Missing]
pmts_dpd_303P has 6980863 (88.8%) missing values [Missing]
pmts_month_158T has 1367481 (17.4%) missing values [Missing]
pmts_month_706T has 6004125 (76.4%) missing values [Missing]
pmts_overdue_1140A has 4796427 (61.0%) missing values [Missing]
pmts_overdue_1152A has 6980097 (88.8%) missing values [Missing]
pmts_year_1139T has 1367481 (17.4%) missing values [Missing]
pmts_year_507T has 6004125 (76.4%) missing values [Missing]
collater_valueofguarantee_1124L is highly skewed (γ1 = 39.83155035) [Skewed]
collater_valueofguarantee_876L is highly skewed (γ1 = 46.03171702) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 270.4731214) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 82.61699864) [Skewed]
collater_valueofguarantee_1124L has 226797 (2.9%) zeros [Zeros]
num_group1 has 3681491 (46.8%) zeros [Zeros]
num_group2 has 293211 (3.7%) zeros [Zeros]
pmts_dpd_1073P has 2881359 (36.7%) zeros [Zeros]
pmts_dpd_303P has 743453 (9.5%) zeros [Zeros]
pmts_overdue_1140A has 2884201 (36.7%) zeros [Zeros]
pmts_overdue_1152A has 737231 (9.4%) zeros [Zeros]
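
The "Skewed" alerts can be reproduced from the skewness values listed in the per-variable sections. The sketch below is illustrative only: the cutoff of 20 is an assumption chosen because it is consistent with the alerts shown here, and the report itself does not state which threshold was configured.

```python
# Skewness values copied from the "Descriptive statistics" of each numeric variable.
skewness = {
    "case_id": 0.3443767808,
    "collater_valueofguarantee_1124L": 39.83155035,
    "collater_valueofguarantee_876L": 46.03171702,
    "num_group1": 9.645424258,
    "num_group2": 0.3974180425,
    "pmts_dpd_1073P": 16.87802932,
    "pmts_dpd_303P": 7.26241723,
    "pmts_month_158T": 0.0,
    "pmts_month_706T": 0.0,
    "pmts_overdue_1140A": 270.4731214,
    "pmts_overdue_1152A": 82.61699864,
    "pmts_year_1139T": -0.2483920452,
    "pmts_year_507T": -0.6327057514,
}

# Assumed cutoff (hypothetical, not taken from the report's configuration).
SKEW_THRESHOLD = 20.0

flagged = sorted(name for name, g1 in skewness.items() if abs(g1) > SKEW_THRESHOLD)
for name in flagged:
    print(f"{name} is highly skewed (γ1 = {skewness[name]})")
```

With this cutoff, exactly the four variables carrying a "Skewed" alert are flagged, while pmts_dpd_1073P (γ1 ≈ 16.9) stays below it.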

Reproduction

Analysis started: 2024-02-13 19:43:48.281764
Analysis finished: 2024-02-13 19:44:12.604902
Duration: 24.32 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json

Variables

case_id
Real number (ℝ)

Distinct: 118481
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1237554.335
Minimum: 6683
Maximum: 2570525
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 6683
5-th percentile: 116518
Q1: 665420
median: 1313748
Q3: 1349578
95-th percentile: 2562168
Maximum: 2570525
Range: 2563842
Interquartile range (IQR): 684158

Descriptive statistics

Standard deviation: 754506.3288
Coefficient of variation (CV): 0.6096753147
Kurtosis: -0.4839484479
Mean: 1237554.335
Median Absolute Deviation (MAD): 630452
Skewness: 0.3443767808
Sum: 9.72941581 × 10^12
Variance: 5.692798002 × 10^11
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
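
Several of the case_id statistics are simple functions of the others. As a hedged illustration (a sketch written for this document, not output of the profiling tool), the relationships can be checked in Python using the standard definitions of range, IQR, coefficient of variation and variance:

```python
import math

# Values copied from the case_id statistics.
minimum, maximum = 6683, 2570525
q1, q3 = 665420, 1349578
mean = 1237554.335
std = 754506.3288
n_rows = 7_861_809  # number of observations; case_id has no missing values

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_rows             # Sum

print(value_range, iqr)
print(f"CV = {cv:.10f}")
print(f"variance = {variance:.9e}")
print(f"sum = {total:.8e}")

# Compare against the figures printed in the report.
assert value_range == 2563842 and iqr == 684158
assert math.isclose(cv, 0.6096753147, rel_tol=1e-6)
assert math.isclose(variance, 5.692798002e11, rel_tol=1e-6)
assert math.isclose(total, 9.72941581e12, rel_tol=1e-6)
```

The comparisons use a relative tolerance because the report prints its inputs with about ten significant digits.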
Value | Count | Frequency (%)
1308352 | 2232 | < 0.1%
674520 | 1368 | < 0.1%
1311727 | 1272 | < 0.1%
116967 | 1056 | < 0.1%
1314455 | 1020 | < 0.1%
2551851 | 984 | < 0.1%
1308414 | 912 | < 0.1%
1313138 | 912 | < 0.1%
2552322 | 888 | < 0.1%
1314098 | 876 | < 0.1%
Other values (118471) | 7850289 | 99.9%

Value | Count | Frequency (%)
6683 | 72 | < 0.1%
6791 | 24 | < 0.1%
6817 | 36 | < 0.1%
6888 | 60 | < 0.1%
6985 | 24 | < 0.1%

Value | Count | Frequency (%)
2570525 | 12 | < 0.1%
2570522 | 48 | < 0.1%
2570521 | 96 | < 0.1%
2570520 | 36 | < 0.1%
2570519 | 72 | < 0.1%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 60.0 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 62894472
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8fd95e4b
2nd row: 8fd95e4b
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7617072 | 96.9%
9a0c095e | 166646 | 2.1%
8fd95e4b | 77979 | 1.0%
06fb9ba8 | 112 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23095841
36.7%
a 7783830
 
12.4%
b 7695275
 
12.2%
4 7695051
 
12.2%
7 7617072
 
12.1%
1 7617072
 
12.1%
9 411383
 
0.7%
0 333404
 
0.5%
e 244625
 
0.4%
c 166646
 
0.3%
Other values (4) 234273
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 46848026
74.5%
Lowercase Letter 16046446
 
25.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23095841
49.3%
4 7695051
 
16.4%
7 7617072
 
16.3%
1 7617072
 
16.3%
9 411383
 
0.9%
0 333404
 
0.7%
8 78091
 
0.2%
6 112
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 7783830
48.5%
b 7695275
48.0%
e 244625
 
1.5%
c 166646
 
1.0%
f 78091
 
0.5%
d 77979
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common 46848026
74.5%
Latin 16046446
 
25.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23095841
49.3%
4 7695051
 
16.4%
7 7617072
 
16.3%
1 7617072
 
16.3%
9 411383
 
0.9%
0 333404
 
0.7%
8 78091
 
0.2%
6 112
 
< 0.1%
Latin
ValueCountFrequency (%)
a 7783830
48.5%
b 7695275
48.0%
e 244625
 
1.5%
c 166646
 
1.0%
f 78091
 
0.5%
d 77979
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62894472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23095841
36.7%
a 7783830
 
12.4%
b 7695275
 
12.2%
4 7695051
 
12.2%
7 7617072
 
12.1%
1 7617072
 
12.1%
9 411383
 
0.7%
0 333404
 
0.5%
e 244625
 
0.4%
c 166646
 
0.3%
Other values (4) 234273
 
0.4%

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 60.0 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 62894472
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7782726 | 99.0%
9a0c095e | 45432 | 0.6%
8fd95e4b | 33548 | 0.4%
06fb9ba8 | 87 | < 0.1%
3cbe86ba | 16 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23427158
37.2%
a 7828261
 
12.4%
b 7816480
 
12.4%
4 7816274
 
12.4%
7 7782726
 
12.4%
1 7782726
 
12.4%
9 124499
 
0.2%
0 90951
 
0.1%
e 78996
 
0.1%
c 45448
 
0.1%
Other values (5) 100953
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 47058104
74.8%
Lowercase Letter 15836368
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23427158
49.8%
4 7816274
 
16.6%
7 7782726
 
16.5%
1 7782726
 
16.5%
9 124499
 
0.3%
0 90951
 
0.2%
8 33651
 
0.1%
6 103
 
< 0.1%
3 16
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 7828261
49.4%
b 7816480
49.4%
e 78996
 
0.5%
c 45448
 
0.3%
f 33635
 
0.2%
d 33548
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 47058104
74.8%
Latin 15836368
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23427158
49.8%
4 7816274
 
16.6%
7 7782726
 
16.5%
1 7782726
 
16.5%
9 124499
 
0.3%
0 90951
 
0.2%
8 33651
 
0.1%
6 103
 
< 0.1%
3 16
 
< 0.1%
Latin
ValueCountFrequency (%)
a 7828261
49.4%
b 7816480
49.4%
e 78996
 
0.5%
c 45448
 
0.3%
f 33635
 
0.2%
d 33548
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62894472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23427158
37.2%
a 7828261
 
12.4%
b 7816480
 
12.4%
4 7816274
 
12.4%
7 7782726
 
12.4%
1 7782726
 
12.4%
9 124499
 
0.2%
0 90951
 
0.1%
e 78996
 
0.1%
c 45448
 
0.1%
Other values (5) 100953
 
0.2%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 11467
Distinct (%): 4.7%
Missing: 7617072
Missing (%): 96.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 7284052.824
Minimum: 0
Maximum: 1.16 × 10^10
Zeros: 226797
Zeros (%): 2.9%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3975522.4
Maximum: 1.16 × 10^10
Range: 1.16 × 10^10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 214334935.6
Coefficient of variation (CV): 29.42523081
Kurtosis: 1716.403271
Mean: 7284052.824
Median Absolute Deviation (MAD): 0
Skewness: 39.83155035
Sum: 1.782677236 × 10^12
Variance: 4.59394646 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 226797 | 2.9%
100000000 | 236 | < 0.1%
800000000 | 147 | < 0.1%
5000000 | 142 | < 0.1%
4000000 | 139 | < 0.1%
20000000 | 132 | < 0.1%
6804986362 | 128 | < 0.1%
1 | 119 | < 0.1%
3000000 | 112 | < 0.1%
2000000 | 93 | < 0.1%
Other values (11457) | 16692 | 0.2%
(Missing) | 7617072 | 96.9%

Value | Count | Frequency (%)
0 | 226797 | 2.9%
1 | 119 | < 0.1%
1038 | 1 | < 0.1%
2295.3 | 1 | < 0.1%
2300.09 | 1 | < 0.1%

Value | Count | Frequency (%)
1.16 × 10^10 | 30 | < 0.1%
9240000000 | 3 | < 0.1%
8500000000 | 5 | < 0.1%
7535965329 | 7 | < 0.1%
6804986362 | 128 | < 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED 

Distinct: 4138
Distinct (%): 5.2%
Missing: 7782726
Missing (%): 99.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1536282.899
Minimum: 0
Maximum: 3250000000
Zeros: 70374
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 140489.4
Maximum: 3250000000
Range: 3250000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 53679046.21
Coefficient of variation (CV): 34.94086033
Kurtosis: 2321.351008
Mean: 1536282.899
Median Absolute Deviation (MAD): 0
Skewness: 46.03171702
Sum: 1.214938605 × 10^11
Variance: 2.881440002 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 70374 | 0.9%
60000 | 281 | < 0.1%
130000 | 230 | < 0.1%
100000 | 210 | < 0.1%
65000 | 137 | < 0.1%
50000 | 123 | < 0.1%
70000 | 94 | < 0.1%
40000 | 93 | < 0.1%
150000 | 89 | < 0.1%
300000 | 79 | < 0.1%
Other values (4128) | 7373 | 0.1%
(Missing) | 7782726 | 99.0%

Value | Count | Frequency (%)
0 | 70374 | 0.9%
0.02 | 1 | < 0.1%
1 | 25 | < 0.1%
1.8 | 1 | < 0.1%
66 | 1 | < 0.1%

Value | Count | Frequency (%)
3250000000 | 7 | < 0.1%
3200000000 | 3 | < 0.1%
2000000000 | 26 | < 0.1%
1200000000 | 7 | < 0.1%
840000000 | 4 | < 0.1%

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 60.0 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 62894472
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7782726 | 99.0%
c7a5ad39 | 58977 | 0.8%
3cbe86ba | 14846 | 0.2%
9276e4bb | 1885 | < 0.1%
0e63c0f0 | 865 | < 0.1%
168ad9f3 | 603 | < 0.1%
940efad7 | 415 | < 0.1%
5224034a | 401 | < 0.1%
7b62420e | 380 | < 0.1%
2fd21cf1 | 225 | < 0.1%
Other values (5) | 486 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23407840
37.2%
a 7917398
 
12.6%
7 7844697
 
12.5%
b 7816754
 
12.4%
4 7786764
 
12.4%
1 7783886
 
12.4%
3 75869
 
0.1%
c 75197
 
0.1%
9 62234
 
0.1%
d 60343
 
0.1%
Other values (6) 63490
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 47003821
74.7%
Lowercase Letter 15890651
 
25.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23407840
49.8%
7 7844697
 
16.7%
4 7786764
 
16.6%
1 7783886
 
16.6%
3 75869
 
0.2%
9 62234
 
0.1%
6 18765
 
< 0.1%
8 15679
 
< 0.1%
0 4067
 
< 0.1%
2 4020
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 7917398
49.8%
b 7816754
49.2%
c 75197
 
0.5%
d 60343
 
0.4%
e 18498
 
0.1%
f 2461
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 47003821
74.7%
Latin 15890651
 
25.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23407840
49.8%
7 7844697
 
16.7%
4 7786764
 
16.6%
1 7783886
 
16.6%
3 75869
 
0.2%
9 62234
 
0.1%
6 18765
 
< 0.1%
8 15679
 
< 0.1%
0 4067
 
< 0.1%
2 4020
 
< 0.1%
Latin
ValueCountFrequency (%)
a 7917398
49.8%
b 7816754
49.2%
c 75197
 
0.5%
d 60343
 
0.4%
e 18498
 
0.1%
f 2461
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62894472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23407840
37.2%
a 7917398
 
12.6%
7 7844697
 
12.5%
b 7816754
 
12.4%
4 7786764
 
12.4%
1 7783886
 
12.4%
3 75869
 
0.1%
c 75197
 
0.1%
9 62234
 
0.1%
d 60343
 
0.1%
Other values (6) 63490
 
0.1%

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 60.0 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 62894472
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9276e4bb
2nd row: 0e63c0f0
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7617072 | 96.9%
c7a5ad39 | 218630 | 2.8%
9276e4bb | 9654 | 0.1%
0e63c0f0 | 9165 | 0.1%
7b62420e | 2332 | < 0.1%
168ad9f3 | 2257 | < 0.1%
f4d8a027 | 886 | < 0.1%
940efad7 | 582 | < 0.1%
3cbe86ba | 468 | < 0.1%
2fd21cf1 | 272 | < 0.1%
Other values (5) | 491 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23070075
36.7%
a 8059231
 
12.8%
7 7849465
 
12.5%
b 7639952
 
12.1%
4 7631162
 
12.1%
1 7619915
 
12.1%
9 231217
 
0.4%
3 230707
 
0.4%
c 228624
 
0.4%
d 222627
 
0.4%
Other values (6) 111497
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 46708356
74.3%
Lowercase Letter 16186116
 
25.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23070075
49.4%
7 7849465
 
16.8%
4 7631162
 
16.3%
1 7619915
 
16.3%
9 231217
 
0.5%
3 230707
 
0.5%
0 31954
 
0.1%
6 24180
 
0.1%
2 16028
 
< 0.1%
8 3653
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 8059231
49.8%
b 7639952
47.2%
c 228624
 
1.4%
d 222627
 
1.4%
e 22243
 
0.1%
f 13439
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 46708356
74.3%
Latin 16186116
 
25.7%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23070075
49.4%
7 7849465
 
16.8%
4 7631162
 
16.3%
1 7619915
 
16.3%
9 231217
 
0.5%
3 230707
 
0.5%
0 31954
 
0.1%
6 24180
 
0.1%
2 16028
 
< 0.1%
8 3653
 
< 0.1%
Latin
ValueCountFrequency (%)
a 8059231
49.8%
b 7639952
47.2%
c 228624
 
1.4%
d 222627
 
1.4%
e 22243
 
0.1%
f 13439
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62894472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23070075
36.7%
a 8059231
 
12.8%
7 7849465
 
12.5%
b 7639952
 
12.1%
4 7631162
 
12.1%
1 7619915
 
12.1%
9 231217
 
0.4%
3 230707
 
0.4%
c 228624
 
0.4%
d 222627
 
0.4%
Other values (6) 111497
 
0.2%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 156
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.615250892
Minimum: 0
Maximum: 155
Zeros: 3681491
Zeros (%): 46.8%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 7
Maximum: 155
Range: 155
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.428795025
Coefficient of variation (CV): 2.122763121
Kurtosis: 242.4986747
Mean: 1.615250892
Median Absolute Deviation (MAD): 1
Skewness: 9.645424258
Sum: 12698794
Variance: 11.75663533
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 3681491 | 46.8%
1 | 1923280 | 24.5%
2 | 896845 | 11.4%
3 | 416891 | 5.3%
4 | 221835 | 2.8%
5 | 149725 | 1.9%
6 | 111421 | 1.4%
7 | 88204 | 1.1%
8 | 71462 | 0.9%
9 | 57493 | 0.7%
Other values (146) | 243162 | 3.1%

Value | Count | Frequency (%)
0 | 3681491 | 46.8%
1 | 1923280 | 24.5%
2 | 896845 | 11.4%
3 | 416891 | 5.3%
4 | 221835 | 2.8%

Value | Count | Frequency (%)
155 | 12 | < 0.1%
154 | 12 | < 0.1%
153 | 12 | < 0.1%
152 | 12 | < 0.1%
151 | 12 | < 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct: 98
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.1832741
Minimum: 0
Maximum: 97
Zeros: 293211
Zeros (%): 3.7%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
median: 13
Q3: 21
95-th percentile: 32
Maximum: 97
Range: 97
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 9.489662499
Coefficient of variation (CV): 0.6690741806
Kurtosis: -0.7686121212
Mean: 14.1832741
Median Absolute Deviation (MAD): 7
Skewness: 0.3974180425
Sum: 111506192
Variance: 90.05369434
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 293211 | 3.7%
7 | 293201 | 3.7%
11 | 293201 | 3.7%
10 | 293201 | 3.7%
9 | 293201 | 3.7%
8 | 293201 | 3.7%
1 | 293201 | 3.7%
6 | 293201 | 3.7%
4 | 293201 | 3.7%
3 | 293201 | 3.7%
Other values (88) | 4929789 | 62.7%

Value | Count | Frequency (%)
0 | 293211 | 3.7%
1 | 293201 | 3.7%
2 | 293201 | 3.7%
3 | 293201 | 3.7%
4 | 293201 | 3.7%

Value | Count | Frequency (%)
97 | 1 | < 0.1%
96 | 1 | < 0.1%
95 | 2 | < 0.1%
94 | 2 | < 0.1%
93 | 2 | < 0.1%

pmts_dpd_1073P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3396
Distinct (%): 0.1%
Missing: 4801989
Missing (%): 61.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.98231791
Minimum: 0
Maximum: 4799
Zeros: 2881359
Zeros (%): 36.7%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 4799
Range: 4799
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 124.6111472
Coefficient of variation (CV): 11.34652522
Kurtosis: 352.9585524
Mean: 10.98231791
Median Absolute Deviation (MAD): 0
Skewness: 16.87802932
Sum: 33603916
Variance: 15527.938
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
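
For a variable with many missing values it matters which denominator each percentage uses. The sketch below is an illustration added for this document, inferred from the numbers rather than from the tool's documentation. It suggests that for pmts_dpd_1073P the mean and "Distinct (%)" are taken over non-missing rows, while "Zeros (%)" is taken over all rows.

```python
import math

# Values copied from the pmts_dpd_1073P statistics and the dataset overview.
n_rows = 7_861_809
missing = 4_801_989
distinct = 3_396
zeros = 2_881_359
total = 33_603_916  # "Sum"

n_present = n_rows - missing                 # rows with a value
mean = total / n_present                     # mean over non-missing rows
distinct_pct = 100 * distinct / n_present    # distinct share of non-missing rows
zeros_pct = 100 * zeros / n_rows             # zero share of all rows

print(f"non-missing rows: {n_present}")
print(f"mean: {mean:.8f}")
print(f"distinct: {distinct_pct:.1f}%  zeros: {zeros_pct:.1f}%")

# These reproduce the reported Mean, Distinct (%) and Zeros (%).
assert math.isclose(mean, 10.98231791, rel_tol=1e-6)
assert round(distinct_pct, 1) == 0.1
assert round(zeros_pct, 1) == 36.7
```

Dividing the zero count by the non-missing rows instead would give roughly 94%, which is a useful reminder that most recorded values of this variable are zero.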
Value | Count | Frequency (%)
0 | 2881359 | 36.7%
1 | 26488 | 0.3%
3 | 11992 | 0.2%
2 | 10968 | 0.1%
4 | 10623 | 0.1%
7 | 6016 | 0.1%
5 | 5808 | 0.1%
6 | 5570 | 0.1%
9 | 4291 | 0.1%
8 | 4210 | 0.1%
Other values (3386) | 92495 | 1.2%
(Missing) | 4801989 | 61.1%

Value | Count | Frequency (%)
0 | 2881359 | 36.7%
1 | 26488 | 0.3%
2 | 10968 | 0.1%
3 | 11992 | 0.2%
4 | 10623 | 0.1%

Value | Count | Frequency (%)
4799 | 1 | < 0.1%
4783 | 1 | < 0.1%
4756 | 1 | < 0.1%
4730 | 1 | < 0.1%
4703 | 1 | < 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3331
Distinct (%): 0.4%
Missing: 6980863
Missing (%): 88.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.15501858
Minimum: -6
Maximum: 4897
Zeros: 743453
Zeros (%): 9.5%
Negative: 147
Negative (%): < 0.1%
Memory size: 60.0 MiB

Quantile statistics

Minimum: -6
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 261
Maximum: 4897
Range: 4903
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 243.8767224
Coefficient of variation (CV): 4.961379925
Kurtosis: 65.69383365
Mean: 49.15501858
Median Absolute Deviation (MAD): 0
Skewness: 7.26241723
Sum: 43302917
Variance: 59475.85573
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 743453 | 9.5%
1 | 21931 | 0.3%
3 | 5164 | 0.1%
2 | 4974 | 0.1%
4 | 4272 | 0.1%
6 | 3722 | < 0.1%
7 | 2919 | < 0.1%
5 | 2890 | < 0.1%
8 | 1998 | < 0.1%
9 | 1963 | < 0.1%
Other values (3321) | 87660 | 1.1%
(Missing) | 6980863 | 88.8%

Value | Count | Frequency (%)
-6 | 3 | < 0.1%
-5 | 2 | < 0.1%
-4 | 3 | < 0.1%
-3 | 25 | < 0.1%
-2 | 46 | < 0.1%

Value | Count | Frequency (%)
4897 | 1 | < 0.1%
4883 | 1 | < 0.1%
4853 | 1 | < 0.1%
4828 | 1 | < 0.1%
4798 | 1 | < 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 1367481
Missing (%): 17.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052795
Coefficient of variation (CV): 0.5310850454
Kurtosis: -1.21678323
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 42213132
Variance: 11.9166685
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
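
The pmts_month_158T statistics are what one would expect from a discrete uniform distribution over the months 1 to 12, since every month occurs exactly 541194 times. The following Python sketch, added as an illustration and not taken from the report, recomputes the moments from that observation. It assumes the reported variance uses the sample (n - 1) denominator, which is an inference from the digits rather than a documented fact.

```python
# Every month 1..12 has the same count in pmts_month_158T.
count_per_month = 541_194
months = list(range(1, 13))
n = count_per_month * len(months)  # non-missing observations

mean = sum(months) / len(months)
pop_var = sum((m - mean) ** 2 for m in months) / len(months)   # (12^2 - 1) / 12
sample_var = pop_var * n / (n - 1)                             # assumed n - 1 denominator
fourth_moment = sum((m - mean) ** 4 for m in months) / len(months)
excess_kurtosis = fourth_moment / pop_var ** 2 - 3

print(f"n = {n}, sum = {n * mean:.0f}")
print(f"mean = {mean}")
print(f"sample variance = {sample_var:.7f}")
print(f"std = {sample_var ** 0.5:.9f}")
print(f"excess kurtosis = {excess_kurtosis:.6f}")
```

The results agree with the reported Mean, Sum, Variance, Standard deviation and Kurtosis to the precision shown, and the symmetry of the distribution explains the skewness of 0.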
Value | Count | Frequency (%)
2 | 541194 | 6.9%
3 | 541194 | 6.9%
4 | 541194 | 6.9%
5 | 541194 | 6.9%
6 | 541194 | 6.9%
7 | 541194 | 6.9%
8 | 541194 | 6.9%
9 | 541194 | 6.9%
10 | 541194 | 6.9%
11 | 541194 | 6.9%
Other values (2) | 1082388 | 13.8%
(Missing) | 1367481 | 17.4%

Value | Count | Frequency (%)
1 | 541194 | 6.9%
2 | 541194 | 6.9%
3 | 541194 | 6.9%
4 | 541194 | 6.9%
5 | 541194 | 6.9%

Value | Count | Frequency (%)
12 | 541194 | 6.9%
11 | 541194 | 6.9%
10 | 541194 | 6.9%
9 | 541194 | 6.9%
8 | 541194 | 6.9%

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 6004125
Missing (%): 76.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452053459
Coefficient of variation (CV): 0.5310851475
Kurtosis: -1.216783262
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 12074946
Variance: 11.91667308
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
2 | 154807 | 2.0%
3 | 154807 | 2.0%
4 | 154807 | 2.0%
5 | 154807 | 2.0%
6 | 154807 | 2.0%
7 | 154807 | 2.0%
8 | 154807 | 2.0%
9 | 154807 | 2.0%
10 | 154807 | 2.0%
11 | 154807 | 2.0%
Other values (2) | 309614 | 3.9%
(Missing) | 6004125 | 76.4%

Value | Count | Frequency (%)
1 | 154807 | 2.0%
2 | 154807 | 2.0%
3 | 154807 | 2.0%
4 | 154807 | 2.0%
5 | 154807 | 2.0%

Value | Count | Frequency (%)
12 | 154807 | 2.0%
11 | 154807 | 2.0%
10 | 154807 | 2.0%
9 | 154807 | 2.0%
8 | 154807 | 2.0%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 131804
Distinct (%): 4.3%
Missing: 4796427
Missing (%): 61.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1924.279352
Minimum: 0
Maximum: 49261336
Zeros: 2884201
Zeros (%): 36.7%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 550.6471025
Maximum: 49261336
Range: 49261336
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 157002.3223
Coefficient of variation (CV): 81.59019223
Kurtosis: 80443.74089
Mean: 1924.279352
Median Absolute Deviation (MAD): 0
Skewness: 270.4731214
Sum: 5898651290
Variance: 2.46497292 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2884201 | 36.7%
1000 | 357 | < 0.1%
2000 | 229 | < 0.1%
3900 | 213 | < 0.1%
3000 | 204 | < 0.1%
400 | 201 | < 0.1%
0.4 | 194 | < 0.1%
4 | 172 | < 0.1%
2 | 166 | < 0.1%
2600 | 158 | < 0.1%
Other values (131794) | 179287 | 2.3%
(Missing) | 4796427 | 61.0%

Value | Count | Frequency (%)
0 | 2884201 | 36.7%
0.002 | 25 | < 0.1%
0.004 | 5 | < 0.1%
0.006 | 10 | < 0.1%
0.008 | 9 | < 0.1%

Value | Count | Frequency (%)
49261336 | 20 | < 0.1%
49005736 | 4 | < 0.1%
33724950 | 1 | < 0.1%
32733930 | 1 | < 0.1%
31796478 | 1 | < 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 79465
Distinct (%): 9.0%
Missing: 6980097
Missing (%): 88.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 3255.019038
Minimum: 0
Maximum: 5908453.5
Zeros: 737231
Zeros (%): 9.4%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 13826.31335
Maximum: 5908453.5
Range: 5908453.5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 32183.74714
Coefficient of variation (CV): 9.887422089
Kurtosis: 11673.7185
Mean: 3255.019038
Median Absolute Deviation (MAD): 0
Skewness: 82.61699864
Sum: 2869989346
Variance: 1035793580
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 737231 | 9.4%
0.2 | 434 | < 0.1%
1000 | 354 | < 0.1%
0.4 | 233 | < 0.1%
2000 | 231 | < 0.1%
0.8 | 181 | < 0.1%
1.6 | 170 | < 0.1%
3000 | 155 | < 0.1%
2 | 146 | < 0.1%
1.8000001 | 137 | < 0.1%
Other values (79455) | 142440 | 1.8%
(Missing) | 6980097 | 88.8%

Value | Count | Frequency (%)
0 | 737231 | 9.4%
0.002 | 9 | < 0.1%
0.004 | 3 | < 0.1%
0.006 | 13 | < 0.1%
0.008 | 4 | < 0.1%

Value | Count | Frequency (%)
5908453.5 | 7 | < 0.1%
3783489.2 | 1 | < 0.1%
3515548.8 | 9 | < 0.1%
2832545 | 4 | < 0.1%
2802019.2 | 2 | < 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 1367481
Missing (%): 17.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.346633
Minimum: 2016
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 2016
5-th percentile: 2017
Q1: 2018
median: 2018
Q3: 2019
95-th percentile: 2019
Maximum: 2020
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7875883614
Coefficient of variation (CV): 0.0003902146186
Kurtosis: -0.6971782139
Mean: 2018.346633
Median Absolute Deviation (MAD): 1
Skewness: -0.2483920452
Sum: 1.310780505 × 10^10
Variance: 0.620295427
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
Value | Count | Frequency (%)
2019 | 2818069 | 35.8%
2018 | 2398307 | 30.5%
2017 | 1040385 | 13.2%
2020 | 237149 | 3.0%
2016 | 418 | < 0.1%
(Missing) | 1367481 | 17.4%

Value | Count | Frequency (%)
2016 | 418 | < 0.1%
2017 | 1040385 | 13.2%
2018 | 2398307 | 30.5%
2019 | 2818069 | 35.8%
2020 | 237149 | 3.0%

Value | Count | Frequency (%)
2020 | 237149 | 3.0%
2019 | 2818069 | 35.8%
2018 | 2398307 | 30.5%
2017 | 1040385 | 13.2%
2016 | 418 | < 0.1%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct: 19
Distinct (%): < 0.1%
Missing: 6004125
Missing (%): 76.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.756449
Minimum: 2002
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 60.0 MiB

Quantile statistics

Minimum: 2002
5-th percentile: 2007
Q1: 2011
median: 2015
Q3: 2017
95-th percentile: 2018
Maximum: 2020
Range: 18
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.769275702
Coefficient of variation (CV): 0.001871763442
Kurtosis: -0.6536546618
Mean: 2013.756449
Median Absolute Deviation (MAD): 2
Skewness: -0.6327057514
Sum: 3740923135
Variance: 14.20743932
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Value | Count | Frequency (%)
2017 | 254390 | 3.2%
2018 | 247432 | 3.1%
2016 | 203298 | 2.6%
2015 | 176984 | 2.3%
2014 | 160913 | 2.0%
2013 | 139123 | 1.8%
2012 | 112345 | 1.4%
2011 | 95720 | 1.2%
2008 | 81422 | 1.0%
2007 | 81270 | 1.0%
Other values (9) | 304787 | 3.9%
(Missing) | 6004125 | 76.4%

Value | Count | Frequency (%)
2002 | 33 | < 0.1%
2003 | 201 | < 0.1%
2004 | 5012 | 0.1%
2005 | 23400 | 0.3%
2006 | 57526 | 0.7%

Value | Count | Frequency (%)
2020 | 3732 | < 0.1%
2019 | 61585 | 0.8%
2018 | 247432 | 3.1%
2017 | 254390 | 3.2%
2016 | 203298 | 2.6%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 60.0 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 62894472
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7783000 | 99.0%
ab3c25cf | 77446 | 1.0%
15f04f45 | 780 | < 0.1%
be4fd70b | 363 | < 0.1%
daf49a8a | 216 | < 0.1%
1d94eac1 | 3 | < 0.1%
0c42a10e | 1 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 23428006
37.2%
b 7861172
 
12.5%
a 7861098
 
12.5%
4 7785143
 
12.4%
1 7783787
 
12.4%
7 7783363
 
12.4%
c 154896
 
0.2%
f 79585
 
0.1%
2 77447
 
0.1%
3 77446
 
0.1%
Other values (5) 2529
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 46936772
74.6%
Lowercase Letter 15957700
 
25.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23428006
49.9%
4 7785143
 
16.6%
1 7783787
 
16.6%
7 7783363
 
16.6%
2 77447
 
0.2%
3 77446
 
0.2%
0 1145
 
< 0.1%
9 219
 
< 0.1%
8 216
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 7861172
49.3%
a 7861098
49.3%
c 154896
 
1.0%
f 79585
 
0.5%
d 582
 
< 0.1%
e 367
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 46936772
74.6%
Latin 15957700
 
25.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23428006
49.9%
4 7785143
 
16.6%
1 7783787
 
16.6%
7 7783363
 
16.6%
2 77447
 
0.2%
3 77446
 
0.2%
0 1145
 
< 0.1%
9 219
 
< 0.1%
8 216
 
< 0.1%
Latin
ValueCountFrequency (%)
b 7861172
49.3%
a 7861098
49.3%
c 154896
 
1.0%
f 79585
 
0.5%
d 582
 
< 0.1%
e 367
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62894472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23428006
37.2%
b 7861172
 
12.5%
a 7861098
 
12.5%
4 7785143
 
12.4%
1 7783787
 
12.4%
7 7783363
 
12.4%
c 154896
 
0.2%
f 79585
 
0.1%
2 77447
 
0.1%
3 77446
 
0.1%
Other values (5) 2529
 
< 0.1%

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 60.0 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000042611
Min length: 8

Characters and Unicode

Total characters: 62894807
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value | Count | Frequency (%)
a55475b1 | 7623096 | 97.0%
ab3c25cf | 230303 | 2.9%
be4fd70b | 4924 | 0.1%
15f04f45 | 1596 | < 0.1%
daf49a8a | 1550 | < 0.1%
p28_48_88 | 335 | < 0.1%
0c42a10e | 3 | < 0.1%
71ddaa88 | 2 | < 0.1%
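
The length and character totals of this text variable follow directly from its value counts: all values are 8 characters long except the 335 occurrences of the 9-character value. The Python sketch below is an illustration added for this document; the dictionary simply restates the value table.

```python
# Value counts copied from the frequency table of this text variable.
value_counts = {
    "a55475b1": 7_623_096,
    "ab3c25cf": 230_303,
    "be4fd70b": 4_924,
    "15f04f45": 1_596,
    "daf49a8a": 1_550,
    "p28_48_88": 335,
    "0c42a10e": 3,
    "71ddaa88": 2,
}

n_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows
max_length = max(len(value) for value in value_counts)
min_length = min(len(value) for value in value_counts)

print(f"rows: {n_rows}")
print(f"total characters: {total_chars}")
print(f"mean length: {mean_length:.9f}")
print(f"min/max length: {min_length}/{max_length}")
```

The same arithmetic explains why the other text variables, whose values are all 8 characters long, report exactly 62894472 total characters (7861809 × 8).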

Most occurring characters

ValueCountFrequency (%)
5 23102783
36.7%
b 7863247
 
12.5%
a 7858056
 
12.5%
4 7633100
 
12.1%
7 7628022
 
12.1%
1 7624697
 
12.1%
c 460609
 
0.7%
f 239969
 
0.4%
2 230641
 
0.4%
3 230303
 
0.4%
Other values (7) 23380
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 46460516
73.9%
Lowercase Letter 16433286
 
26.1%
Connector Punctuation 670
 
< 0.1%
Uppercase Letter 335
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 23102783
49.7%
4 7633100
 
16.4%
7 7628022
 
16.4%
1 7624697
 
16.4%
2 230641
 
0.5%
3 230303
 
0.5%
0 6526
 
< 0.1%
8 2894
 
< 0.1%
9 1550
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 7863247
47.8%
a 7858056
47.8%
c 460609
 
2.8%
f 239969
 
1.5%
d 6478
 
< 0.1%
e 4927
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 670
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 335
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 46461186
73.9%
Latin 16433621
 
26.1%

Most frequent character per script

Common
ValueCountFrequency (%)
5 23102783
49.7%
4 7633100
 
16.4%
7 7628022
 
16.4%
1 7624697
 
16.4%
2 230641
 
0.5%
3 230303
 
0.5%
0 6526
 
< 0.1%
8 2894
 
< 0.1%
9 1550
 
< 0.1%
_ 670
 
< 0.1%
Latin
ValueCountFrequency (%)
b 7863247
47.8%
a 7858056
47.8%
c 460609
 
2.8%
f 239969
 
1.5%
d 6478
 
< 0.1%
e 4927
 
< 0.1%
P 335
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62894807
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 23102783
36.7%
b 7863247
 
12.5%
a 7858056
 
12.5%
4 7633100
 
12.1%
7 7628022
 
12.1%
1 7624697
 
12.1%
c 460609
 
0.7%
f 239969
 
0.4%
2 230641
 
0.4%
3 230303
 
0.4%
Other values (7) 23380
 
< 0.1%