Overview

Dataset statistics

Number of variables: 19
Number of observations: 25511332
Missing cells: 161328437
Missing cells (%): 33.3%
Total size in memory: 3.6 GiB
Average record size in memory: 152.0 B
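
The percentage and memory figures above follow from simple arithmetic on the reported counts. The sketch below reproduces them. The 8 bytes per cell is an assumption (one 64-bit float or object pointer per value), chosen because it matches the reported 152.0 B per record; the variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
n_variables = 19
n_observations = 25_511_332
missing_cells = 161_328_437

# Assumption: each cell occupies 8 bytes (float64 value or object pointer).
BYTES_PER_CELL = 8

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

record_bytes = n_variables * BYTES_PER_CELL      # size of one row
column_bytes = n_observations * BYTES_PER_CELL   # size of one column
total_bytes = total_cells * BYTES_PER_CELL       # size of the whole table

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Record size: {record_bytes} B")
print(f"Per-column size: {column_bytes / 2**20:.1f} MiB")
print(f"Total size: {total_bytes / 2**30:.1f} GiB")
```

Under that assumption the per-column figure also matches the "Memory size" reported for each variable further down.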

Variable types

Numeric: 13
Text: 6

Alerts

collater_valueofguarantee_1124L has 25182623 (98.7%) missing values [Missing]
collater_valueofguarantee_876L has 24501614 (96.0%) missing values [Missing]
pmts_dpd_1073P has 21490632 (84.2%) missing values [Missing]
pmts_dpd_303P has 14872146 (58.3%) missing values [Missing]
pmts_month_158T has 16723732 (65.6%) missing values [Missing]
pmts_month_706T has 2754316 (10.8%) missing values [Missing]
pmts_overdue_1140A has 21467596 (84.1%) missing values [Missing]
pmts_overdue_1152A has 14857730 (58.2%) missing values [Missing]
pmts_year_1139T has 16723732 (65.6%) missing values [Missing]
pmts_year_507T has 2754316 (10.8%) missing values [Missing]
collater_valueofguarantee_1124L is highly skewed (γ1 = 222.4815149) [Skewed]
collater_valueofguarantee_876L is highly skewed (γ1 = 554.2424378) [Skewed]
pmts_dpd_303P is highly skewed (γ1 = 31.47683138) [Skewed]
pmts_overdue_1140A is highly skewed (γ1 = 187.1643167) [Skewed]
pmts_overdue_1152A is highly skewed (γ1 = 2054.522814) [Skewed]
collater_valueofguarantee_1124L has 301773 (1.2%) zeros [Zeros]
collater_valueofguarantee_876L has 908887 (3.6%) zeros [Zeros]
num_group1 has 4889854 (19.2%) zeros [Zeros]
num_group2 has 1030268 (4.0%) zeros [Zeros]
pmts_dpd_1073P has 3756543 (14.7%) zeros [Zeros]
pmts_dpd_303P has 8873659 (34.8%) zeros [Zeros]
pmts_overdue_1140A has 3775346 (14.8%) zeros [Zeros]
pmts_overdue_1152A has 8809850 (34.5%) zeros [Zeros]
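
The skewness alerts above are consistent with a simple cutoff on the absolute skewness. The sketch below applies a cutoff of 20 to the skewness values reported in the variable sections. The cutoff value and the function name `skewed_variables` are assumptions made for illustration; they are not read from this report's configuration.

```python
# Skewness (γ1) per numeric variable, copied from the variable sections of this report.
skewness = {
    "case_id": -0.3442243149,
    "collater_valueofguarantee_1124L": 222.4815149,
    "collater_valueofguarantee_876L": 554.2424378,
    "num_group1": 8.552489082,
    "num_group2": 0.5287536011,
    "pmts_dpd_1073P": 16.02003956,
    "pmts_dpd_303P": 31.47683138,
    "pmts_month_158T": 0.0,
    "pmts_month_706T": 0.0,
    "pmts_overdue_1140A": 187.1643167,
    "pmts_overdue_1152A": 2054.522814,
    "pmts_year_1139T": -0.1929942781,
    "pmts_year_507T": -0.766296528,
}


def skewed_variables(values, threshold=20.0):
    """Return the names whose absolute skewness exceeds the (assumed) threshold."""
    return sorted(name for name, g1 in values.items() if abs(g1) > threshold)


flagged = skewed_variables(skewness)
for name in flagged:
    print(f"{name} is highly skewed (γ1 = {skewness[name]})")
```

With this cutoff, pmts_dpd_1073P (γ1 ≈ 16.0) and num_group1 (γ1 ≈ 8.6) stay below the line, which matches their absence from the skewness alerts.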

Reproduction

Analysis started: 2024-02-13 19:50:08.714823
Analysis finished: 2024-02-13 19:50:53.306508
Duration: 44.59 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
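
The reported duration is the difference between the two timestamps above. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-02-13 19:50:08.714823")
finished = datetime.fromisoformat("2024-02-13 19:50:53.306508")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```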

Variables

case_id
Real number (ℝ)

Distinct: 150426
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1442419.953
Minimum: 42865
Maximum: 2677343
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 194.6 MiB

Quantile statistics

Minimum: 42865
5-th percentile: 196742
Q1: 943104
Median: 1770308
Q3: 1804494
95-th percentile: 2670785
Maximum: 2677343
Range: 2634478
Interquartile range (IQR): 861390

Descriptive statistics

Standard deviation: 818041.3973
Coefficient of variation (CV): 0.5671312265
Kurtosis: -0.9485639771
Mean: 1442419.953
Median Absolute Deviation (MAD): 47817
Skewness: -0.3442243149
Sum: 3.679805431 × 10^13
Variance: 6.691917277 × 10^11
Monotonicity: Increasing
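
Several of the case_id statistics above are derived from others. The sketch below recomputes the range, the interquartile range, the coefficient of variation, and the variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative.

```python
# Values copied from the case_id statistics above.
minimum, maximum = 42865, 2677343
q1, q3 = 943104, 1804494
mean = 1442419.953
std = 818041.3973

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.6e}")
```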
Histogram with fixed size bins (bins=50)

Common values

Value                    Count       Frequency (%)
214097                   9404        < 0.1%
941237                   4968        < 0.1%
1782211                  4008        < 0.1%
1805200                  2868        < 0.1%
216858                   2448        < 0.1%
1797208                  2316        < 0.1%
212614                   2292        < 0.1%
969281                   2184        < 0.1%
961696                   2040        < 0.1%
1805398                  2004        < 0.1%
Other values (150416)    25476800    99.9%

Minimum 5 values

Value    Count    Frequency (%)
42865    24       < 0.1%
42911    108      < 0.1%
43010    72       < 0.1%
43055    168      < 0.1%
43068    48       < 0.1%

Maximum 5 values

Value      Count    Frequency (%)
2677343    24       < 0.1%
2677342    96       < 0.1%
2677341    84       < 0.1%
2677340    132      < 0.1%
2677339    48       < 0.1%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 194.6 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 204090656
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9a0c095e
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1

Value       Count       Frequency (%)
a55475b1    25182623    98.7%
9a0c095e    214647      0.8%
8fd95e4b    113868      0.4%
06fb9ba8    194         < 0.1%
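
The value table above can be cross-checked against the length and character statistics of the same variable: every value is 8 characters long, so the total character count should equal 8 times the number of rows, and per-character counts can be rebuilt from the value counts. The sketch below does both checks; the name `value_counts` is illustrative.

```python
# Value counts copied from the table above (fixed-length 8-character codes).
value_counts = {
    "a55475b1": 25_182_623,
    "9a0c095e": 214_647,
    "8fd95e4b": 113_868,
    "06fb9ba8": 194,
}

n_rows = sum(value_counts.values())
total_characters = sum(len(code) * count for code, count in value_counts.items())

# Occurrences of the character "5" across all rows.
fives = sum(code.count("5") * count for code, count in value_counts.items())

for code, count in value_counts.items():
    share = 100 * count / n_rows
    label = "< 0.1%" if share < 0.1 else f"{share:.1f}%"
    print(f"{code}  {count:>10}  {label}")
```

The count for a55475b1 (25182623) equals the number of missing values reported for collater_valueofguarantee_1124L. This may mean the code stands in for a missing value, but the report itself does not say so.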

Most occurring characters

ValueCountFrequency (%)
5 75876384
37.2%
a 25397464
 
12.4%
b 25296879
 
12.4%
4 25296491
 
12.4%
7 25182623
 
12.3%
1 25182623
 
12.3%
9 543356
 
0.3%
0 429488
 
0.2%
e 328515
 
0.2%
c 214647
 
0.1%
Other values (4) 342186
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 152625221
74.8%
Lowercase Letter 51465435
 
25.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 75876384
49.7%
4 25296491
 
16.6%
7 25182623
 
16.5%
1 25182623
 
16.5%
9 543356
 
0.4%
0 429488
 
0.3%
8 114062
 
0.1%
6 194
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 25397464
49.3%
b 25296879
49.2%
e 328515
 
0.6%
c 214647
 
0.4%
f 114062
 
0.2%
d 113868
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 152625221
74.8%
Latin 51465435
 
25.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 75876384
49.7%
4 25296491
 
16.6%
7 25182623
 
16.5%
1 25182623
 
16.5%
9 543356
 
0.4%
0 429488
 
0.3%
8 114062
 
0.1%
6 194
 
< 0.1%
Latin
ValueCountFrequency (%)
a 25397464
49.3%
b 25296879
49.2%
e 328515
 
0.6%
c 214647
 
0.4%
f 114062
 
0.2%
d 113868
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 204090656
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 75876384
37.2%
a 25397464
 
12.4%
b 25296879
 
12.4%
4 25296491
 
12.4%
7 25182623
 
12.3%
1 25182623
 
12.3%
9 543356
 
0.3%
0 429488
 
0.2%
e 328515
 
0.2%
c 214647
 
0.1%
Other values (4) 342186
 
0.2%
Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size194.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters204090656
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowa55475b1
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 24501614
96.0%
9a0c095e 537869
 
2.1%
8fd95e4b 470331
 
1.8%
06fb9ba8 1308
 
< 0.1%
3cbe86ba 201
 
< 0.1%
9276e4bb 5
 
< 0.1%
c7a5ad39 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 74513046
36.5%
a 25041000
 
12.3%
b 24974973
 
12.2%
4 24971950
 
12.2%
7 24501623
 
12.0%
1 24501614
 
12.0%
9 1547386
 
0.8%
0 1077046
 
0.5%
e 1008406
 
0.5%
c 538074
 
0.3%
Other values (6) 1415538
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 151586229
74.3%
Lowercase Letter 52504427
 
25.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 74513046
49.2%
4 24971950
 
16.5%
7 24501623
 
16.2%
1 24501614
 
16.2%
9 1547386
 
1.0%
0 1077046
 
0.7%
8 471840
 
0.3%
6 1514
 
< 0.1%
3 205
 
< 0.1%
2 5
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 25041000
47.7%
b 24974973
47.6%
e 1008406
 
1.9%
c 538074
 
1.0%
f 471639
 
0.9%
d 470335
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Common 151586229
74.3%
Latin 52504427
 
25.7%

Most frequent character per script

Common
ValueCountFrequency (%)
5 74513046
49.2%
4 24971950
 
16.5%
7 24501623
 
16.2%
1 24501614
 
16.2%
9 1547386
 
1.0%
0 1077046
 
0.7%
8 471840
 
0.3%
6 1514
 
< 0.1%
3 205
 
< 0.1%
2 5
 
< 0.1%
Latin
ValueCountFrequency (%)
a 25041000
47.7%
b 24974973
47.6%
e 1008406
 
1.9%
c 538074
 
1.0%
f 471639
 
0.9%
d 470335
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 204090656
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 74513046
36.5%
a 25041000
 
12.3%
b 24974973
 
12.2%
4 24971950
 
12.2%
7 24501623
 
12.0%
1 24501614
 
12.0%
9 1547386
 
0.8%
0 1077046
 
0.5%
e 1008406
 
0.5%
c 538074
 
0.3%
Other values (6) 1415538
 
0.7%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 14600
Distinct (%): 4.4%
Missing: 25182623
Missing (%): 98.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 6965820.356
Minimum: 0
Maximum: 9.878 × 10^10
Zeros: 301773
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 194.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5199840
Maximum: 9.878 × 10^10
Range: 9.878 × 10^10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 386553639.2
Coefficient of variation (CV): 55.49290959
Kurtosis: 54096.52349
Mean: 6965820.356
Median Absolute Deviation (MAD): 0
Skewness: 222.4815149
Sum: 2.289727843 × 10^12
Variance: 1.49423716 × 10^17
Monotonicity: Not monotonic
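
For a variable that is almost entirely missing, it matters which denominator each percentage uses. The sketch below recomputes the figures for collater_valueofguarantee_1124L from the reported counts: the mean and the distinct percentage are taken over the non-missing values, while the zeros percentage is taken over all rows. The variable names are illustrative.

```python
# Counts copied from the overview and from this variable's statistics.
n_rows = 25_511_332
missing = 25_182_623
zeros = 301_773
distinct = 14_600
total = 2.289727843e12  # reported Sum

present = n_rows - missing                 # non-missing observations
mean_present = total / present             # mean over non-missing values
distinct_pct = 100 * distinct / present    # Distinct (%) over non-missing values
zeros_of_all = 100 * zeros / n_rows        # Zeros (%) over all rows
zeros_of_present = 100 * zeros / present   # zeros among recorded values only

print(f"Non-missing observations: {present}")
print(f"Mean: {mean_present:.3f}")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Zeros (% of all rows): {zeros_of_all:.1f}%")
print(f"Zeros (% of non-missing): {zeros_of_present:.1f}%")
```

Read this way, the 1.2% zeros figure understates how dominant zero is: among the values that are actually recorded, roughly nine in ten are zero, which is why every quantile up to Q3 is 0.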
Histogram with fixed size bins (bins=50)

Common values

Value                   Count       Frequency (%)
0                       301773      1.2%
1300000000              651         < 0.1%
3000000                 156         < 0.1%
5000000                 155         < 0.1%
2000000                 141         < 0.1%
4000000                 138         < 0.1%
10000000                112         < 0.1%
20000000                109         < 0.1%
6000000                 107         < 0.1%
833333.35               97          < 0.1%
Other values (14590)    25270       0.1%
(Missing)               25182623    98.7%

Minimum 5 values

Value      Count     Frequency (%)
0          301773    1.2%
1          77        < 0.1%
1866       1         < 0.1%
2383.71    1         < 0.1%
2445       1         < 0.1%

Maximum 5 values

Value                  Count    Frequency (%)
9.878 × 10^10          4        < 0.1%
4.49 × 10^10           4        < 0.1%
1.194099351 × 10^10    3        < 0.1%
5593274000             2        < 0.1%
3719067464             1        < 0.1%

collater_valueofguarantee_876L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 32485
Distinct (%): 3.2%
Missing: 24501614
Missing (%): 96.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2941478.947
Minimum: 0
Maximum: 9.878 × 10^10
Zeros: 908887
Zeros (%): 3.6%
Negative: 0
Negative (%): 0.0%
Memory size: 194.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 166511.45
Maximum: 9.878 × 10^10
Range: 9.878 × 10^10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 123789868.3
Coefficient of variation (CV): 42.0842272
Kurtosis: 418707.4787
Mean: 2941478.947
Median Absolute Deviation (MAD): 0
Skewness: 554.2424378
Sum: 2.970064239 × 10^12
Variance: 1.532393149 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                   Count       Frequency (%)
0                       908887      3.6%
60000                   2689        < 0.1%
130000                  2138        < 0.1%
100000                  1863        < 0.1%
65000                   1329        < 0.1%
50000                   1265        < 0.1%
70000                   920         < 0.1%
150000                  886         < 0.1%
300000                  877         < 0.1%
200000                  857         < 0.1%
Other values (32475)    88007       0.3%
(Missing)               24501614    96.0%

Minimum 5 values

Value    Count     Frequency (%)
0        908887    3.6%
0.1      1         < 0.1%
0.14     1         < 0.1%
0.24     1         < 0.1%
0.48     1         < 0.1%

Maximum 5 values

Value            Count    Frequency (%)
9.878 × 10^10    1        < 0.1%
4.49 × 10^10     1        < 0.1%
3250000000       60       < 0.1%
3200000000       17       < 0.1%
2921303961       4        < 0.1%
Distinct15
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size194.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters204090656
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowa55475b1
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 24501614
96.0%
c7a5ad39 786090
 
3.1%
3cbe86ba 148053
 
0.6%
9276e4bb 25228
 
0.1%
0e63c0f0 17717
 
0.1%
168ad9f3 7815
 
< 0.1%
5224034a 5628
 
< 0.1%
7b62420e 5296
 
< 0.1%
940efad7 4516
 
< 0.1%
2fd21cf1 3444
 
< 0.1%
Other values (5) 5931
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 74300036
36.4%
a 26245295
 
12.9%
7 25326667
 
12.4%
b 24855889
 
12.2%
4 24554535
 
12.0%
1 24517708
 
12.0%
3 967388
 
0.5%
c 958780
 
0.5%
9 827819
 
0.4%
d 803294
 
0.4%
Other values (6) 733245
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 150986755
74.0%
Lowercase Letter 53103901
 
26.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 74300036
49.2%
7 25326667
 
16.8%
4 24554535
 
16.3%
1 24517708
 
16.2%
3 967388
 
0.6%
9 827819
 
0.5%
6 206526
 
0.1%
8 158688
 
0.1%
0 71995
 
< 0.1%
2 55393
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 26245295
49.4%
b 24855889
46.8%
c 958780
 
1.8%
d 803294
 
1.5%
e 202201
 
0.4%
f 38442
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 150986755
74.0%
Latin 53103901
 
26.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 74300036
49.2%
7 25326667
 
16.8%
4 24554535
 
16.3%
1 24517708
 
16.2%
3 967388
 
0.6%
9 827819
 
0.5%
6 206526
 
0.1%
8 158688
 
0.1%
0 71995
 
< 0.1%
2 55393
 
< 0.1%
Latin
ValueCountFrequency (%)
a 26245295
49.4%
b 24855889
46.8%
c 958780
 
1.8%
d 803294
 
1.5%
e 202201
 
0.4%
f 38442
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 204090656
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 74300036
36.4%
a 26245295
 
12.9%
7 25326667
 
12.4%
b 24855889
 
12.2%
4 24554535
 
12.0%
1 24517708
 
12.0%
3 967388
 
0.5%
c 958780
 
0.5%
9 827819
 
0.4%
d 803294
 
0.4%
Other values (6) 733245
 
0.4%
Distinct15
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size194.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters204090656
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowc7a5ad39
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 25182623
98.7%
c7a5ad39 287925
 
1.1%
9276e4bb 13475
 
0.1%
0e63c0f0 12582
 
< 0.1%
168ad9f3 6524
 
< 0.1%
7b62420e 3842
 
< 0.1%
3cbe86ba 1937
 
< 0.1%
940efad7 657
 
< 0.1%
f4d8a027 639
 
< 0.1%
2fd21cf1 499
 
< 0.1%
Other values (5) 629
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 75836407
37.2%
a 25768689
 
12.6%
7 25489363
 
12.5%
b 25217486
 
12.4%
4 25202116
 
12.3%
1 25190326
 
12.3%
3 309400
 
0.2%
9 308747
 
0.2%
c 303207
 
0.1%
d 296244
 
0.1%
Other values (6) 168671
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 152450951
74.7%
Lowercase Letter 51639705
 
25.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 75836407
49.7%
7 25489363
 
16.7%
4 25202116
 
16.5%
1 25190326
 
16.5%
3 309400
 
0.2%
9 308747
 
0.2%
0 43260
 
< 0.1%
6 38557
 
< 0.1%
2 23494
 
< 0.1%
8 9281
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 25768689
49.9%
b 25217486
48.8%
c 303207
 
0.6%
d 296244
 
0.6%
e 32674
 
0.1%
f 21405
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 152450951
74.7%
Latin 51639705
 
25.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 75836407
49.7%
7 25489363
 
16.7%
4 25202116
 
16.5%
1 25190326
 
16.5%
3 309400
 
0.2%
9 308747
 
0.2%
0 43260
 
< 0.1%
6 38557
 
< 0.1%
2 23494
 
< 0.1%
8 9281
 
< 0.1%
Latin
ValueCountFrequency (%)
a 25768689
49.9%
b 25217486
48.8%
c 303207
 
0.6%
d 296244
 
0.6%
e 32674
 
0.1%
f 21405
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 204090656
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 75836407
37.2%
a 25768689
 
12.6%
7 25489363
 
12.5%
b 25217486
 
12.4%
4 25202116
 
12.3%
1 25190326
 
12.3%
3 309400
 
0.2%
9 308747
 
0.2%
c 303207
 
0.1%
d 296244
 
0.1%
Other values (6) 168671
 
0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct333
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.199547558
Minimum0
Maximum332
Zeros4889854
Zeros (%)19.2%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median3
Q37
95-th percentile17
Maximum332
Range332
Interquartile range (IQR)6

Descriptive statistics

Standard deviation7.81430312
Coefficient of variation (CV)1.502881363
Kurtosis188.0776265
Mean5.199547558
Median Absolute Deviation (MAD)3
Skewness8.552489082
Sum132647384
Variance61.06333325
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 4889854
19.2%
1 3764450
14.8%
2 2904683
11.4%
3 2309598
9.1%
4 1880065
 
7.4%
5 1565930
 
6.1%
6 1309001
 
5.1%
7 1097447
 
4.3%
8 917453
 
3.6%
9 765332
 
3.0%
Other values (323) 4107519
16.1%
ValueCountFrequency (%)
0 4889854
19.2%
1 3764450
14.8%
2 2904683
11.4%
3 2309598
9.1%
4 1880065
 
7.4%
ValueCountFrequency (%)
332 12
< 0.1%
331 12
< 0.1%
330 12
< 0.1%
329 12
< 0.1%
328 12
< 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct101
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean13.55214197
Minimum0
Maximum100
Zeros1030268
Zeros (%)4.0%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum0
5-th percentile1
Q16
median12
Q320
95-th percentile32
Maximum100
Range100
Interquartile range (IQR)14

Descriptive statistics

Standard deviation9.435353914
Coefficient of variation (CV)0.6962260237
Kurtosis-0.3341419037
Mean13.55214197
Median Absolute Deviation (MAD)7
Skewness0.5287536011
Sum345733193
Variance89.02590348
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1030268
 
4.0%
7 1030177
 
4.0%
11 1030177
 
4.0%
10 1030177
 
4.0%
9 1030177
 
4.0%
8 1030177
 
4.0%
1 1030177
 
4.0%
6 1030177
 
4.0%
5 1030177
 
4.0%
4 1030177
 
4.0%
Other values (91) 15209471
59.6%
ValueCountFrequency (%)
0 1030268
4.0%
1 1030177
4.0%
2 1030177
4.0%
3 1030177
4.0%
4 1030177
4.0%
ValueCountFrequency (%)
100 1
 
< 0.1%
99 1
 
< 0.1%
98 93
< 0.1%
97 94
< 0.1%
96 95
< 0.1%

pmts_dpd_1073P
Real number (ℝ)

MISSING  ZEROS 

Distinct3814
Distinct (%)0.1%
Missing21490632
Missing (%)84.2%
Infinite0
Infinite (%)0.0%
Mean13.55760813
Minimum0
Maximum4565
Zeros3756543
Zeros (%)14.7%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile3
Maximum4565
Range4565
Interquartile range (IQR)0

Descriptive statistics

Standard deviation142.5610536
Coefficient of variation (CV)10.51520683
Kurtosis314.7134632
Mean13.55760813
Median Absolute Deviation (MAD)0
Skewness16.02003956
Sum54511075
Variance20323.65399
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3756543
 
14.7%
1 40686
 
0.2%
2 15994
 
0.1%
3 14561
 
0.1%
4 13014
 
0.1%
7 8012
 
< 0.1%
5 7668
 
< 0.1%
6 6989
 
< 0.1%
8 6008
 
< 0.1%
9 5975
 
< 0.1%
Other values (3804) 145250
 
0.6%
(Missing) 21490632
84.2%
ValueCountFrequency (%)
0 3756543
14.7%
1 40686
 
0.2%
2 15994
 
0.1%
3 14561
 
0.1%
4 13014
 
0.1%
ValueCountFrequency (%)
4565 1
< 0.1%
4561 1
< 0.1%
4552 1
< 0.1%
4536 2
< 0.1%
4503 1
< 0.1%

pmts_dpd_303P
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct4273
Distinct (%)< 0.1%
Missing14872146
Missing (%)58.3%
Infinite0
Infinite (%)0.0%
Mean58.91003541
Minimum-18
Maximum117000
Zeros8873659
Zeros (%)34.8%
Negative1621
Negative (%)< 0.1%
Memory size194.6 MiB

Quantile statistics

Minimum-18
5-th percentile0
Q10
median0
Q30
95-th percentile377
Maximum117000
Range117018
Interquartile range (IQR)0

Descriptive statistics

Standard deviation285.3191101
Coefficient of variation (CV)4.843302303
Kurtosis8206.222245
Mean58.91003541
Median Absolute Deviation (MAD)0
Skewness31.47683138
Sum626754824
Variance81406.99462
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 8873659
34.8%
1 243256
 
1.0%
3 64361
 
0.3%
2 61391
 
0.2%
4 53185
 
0.2%
6 43772
 
0.2%
5 35500
 
0.1%
7 34653
 
0.1%
9 25102
 
0.1%
8 24613
 
0.1%
Other values (4263) 1179694
 
4.6%
(Missing) 14872146
58.3%
ValueCountFrequency (%)
-18 11
< 0.1%
-16 2
 
< 0.1%
-15 2
 
< 0.1%
-14 2
 
< 0.1%
-12 1
 
< 0.1%
ValueCountFrequency (%)
117000 1
< 0.1%
84575 2
< 0.1%
84574 1
< 0.1%
84573 1
< 0.1%
84560 2
< 0.1%

pmts_month_158T
Real number (ℝ)

MISSING 

Distinct12
Distinct (%)< 0.1%
Missing16723732
Missing (%)65.6%
Infinite0
Infinite (%)0.0%
Mean6.5
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum1
5-th percentile1
Q13.75
median6.5
Q39.25
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5.5

Descriptive statistics

Standard deviation3.452052726
Coefficient of variation (CV)0.5310850348
Kurtosis-1.216783226
Mean6.5
Median Absolute Deviation (MAD)3
Skewness0
Sum57119400
Variance11.91666802
MonotonicityNot monotonic
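
The pmts_month_158T statistics above are what a perfectly uniform distribution over the months 1 to 12 would give: each month occurs 732300 times among the non-missing values. The sketch below rebuilds the sum, mean, variance, and standard deviation from those counts. Using the sample variance (n - 1 denominator) is an assumption; it is chosen because it reproduces the reported figure.

```python
# Each month 1..12 occurs 732300 times among the non-missing values.
counts = {month: 732_300 for month in range(1, 13)}

n = sum(counts.values())
total = sum(month * c for month, c in counts.items())
mean = total / n

# Sample variance (n - 1 denominator), assumed to match the reported Variance.
squared_dev = sum(c * (month - mean) ** 2 for month, c in counts.items())
variance = squared_dev / (n - 1)
std = variance ** 0.5

print(f"Non-missing observations: {n}")
print(f"Sum: {total}")
print(f"Mean: {mean}")
print(f"Variance: {variance:.8f}")
print(f"Standard deviation: {std:.9f}")
```

The symmetric, flat shape also explains the reported skewness of 0 and the negative kurtosis.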
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 732300
 
2.9%
3 732300
 
2.9%
4 732300
 
2.9%
5 732300
 
2.9%
6 732300
 
2.9%
7 732300
 
2.9%
8 732300
 
2.9%
9 732300
 
2.9%
10 732300
 
2.9%
11 732300
 
2.9%
Other values (2) 1464600
 
5.7%
(Missing) 16723732
65.6%
ValueCountFrequency (%)
1 732300
2.9%
2 732300
2.9%
3 732300
2.9%
4 732300
2.9%
5 732300
2.9%
ValueCountFrequency (%)
12 732300
2.9%
11 732300
2.9%
10 732300
2.9%
9 732300
2.9%
8 732300
2.9%

pmts_month_706T
Real number (ℝ)

MISSING 

Distinct12
Distinct (%)< 0.1%
Missing2754316
Missing (%)10.8%
Infinite0
Infinite (%)0.0%
Mean6.5
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum1
5-th percentile1
Q13.75
median6.5
Q39.25
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5.5

Descriptive statistics

Standard deviation3.452052605
Coefficient of variation (CV)0.5310850162
Kurtosis-1.21678322
Mean6.5
Median Absolute Deviation (MAD)3
Skewness0
Sum147920604
Variance11.91666719
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2 1896418
7.4%
3 1896418
7.4%
4 1896418
7.4%
5 1896418
7.4%
6 1896418
7.4%
7 1896418
7.4%
8 1896418
7.4%
9 1896418
7.4%
10 1896418
7.4%
11 1896418
7.4%
Other values (2) 3792836
14.9%
(Missing) 2754316
10.8%
ValueCountFrequency (%)
1 1896418
7.4%
2 1896418
7.4%
3 1896418
7.4%
4 1896418
7.4%
5 1896418
7.4%
ValueCountFrequency (%)
12 1896418
7.4%
11 1896418
7.4%
10 1896418
7.4%
9 1896418
7.4%
8 1896418
7.4%

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct188405
Distinct (%)4.7%
Missing21467596
Missing (%)84.1%
Infinite0
Infinite (%)0.0%
Mean2147.679317
Minimum0
Maximum28325886
Zeros3775346
Zeros (%)14.8%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1774.0335
Maximum28325886
Range28325886
Interquartile range (IQR)0

Descriptive statistics

Standard deviation: 106264.0546
Coefficient of variation (CV): 49.47854821
Kurtosis: 41865.39734
Mean: 2147.679317
Median Absolute Deviation (MAD): 0
Skewness: 187.1643167
Sum: 8684648170
Variance: 1.12920493 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3775346
 
14.8%
10 386
 
< 0.1%
400 298
 
< 0.1%
1000 292
 
< 0.1%
14 248
 
< 0.1%
35000 222
 
< 0.1%
4228 204
 
< 0.1%
3500 188
 
< 0.1%
2000 178
 
< 0.1%
2 176
 
< 0.1%
Other values (188395) 266198
 
1.0%
(Missing) 21467596
84.1%
ValueCountFrequency (%)
0 3775346
14.8%
0.002 21
 
< 0.1%
0.004 12
 
< 0.1%
0.006 9
 
< 0.1%
0.008 38
 
< 0.1%
ValueCountFrequency (%)
28325886 23
< 0.1%
23891848 10
< 0.1%
17402200 14
< 0.1%
15768560 10
< 0.1%
15237162 2
 
< 0.1%

pmts_overdue_1152A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct759663
Distinct (%)7.1%
Missing14857730
Missing (%)58.2%
Infinite0
Infinite (%)0.0%
Mean4486.441981
Minimum0
Maximum593465000
Zeros8809850
Zeros (%)34.5%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile17498.943
Maximum593465000
Range593465000
Interquartile range (IQR)0

Descriptive statistics

Standard deviation: 215393.7122
Coefficient of variation (CV): 48.00991813
Kurtosis: 5451623.858
Mean: 4486.441981
Median Absolute Deviation (MAD): 0
Skewness: 2054.522814
Sum: 4.779676726 × 10^10
Variance: 4.639445125 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 8809850
34.5%
0.2 5294
 
< 0.1%
1000 3491
 
< 0.1%
2000 2400
 
< 0.1%
0.4 2210
 
< 0.1%
3000 2032
 
< 0.1%
0.8 1675
 
< 0.1%
2 1669
 
< 0.1%
0.6 1664
 
< 0.1%
1.2 1592
 
< 0.1%
Other values (759653) 1821725
 
7.1%
(Missing) 14857730
58.2%
ValueCountFrequency (%)
0 8809850
34.5%
0.002 149
 
< 0.1%
0.004 72
 
< 0.1%
0.006 52
 
< 0.1%
0.008 76
 
< 0.1%
ValueCountFrequency (%)
593465000 1
 
< 0.1%
107822220 7
< 0.1%
52067108 1
 
< 0.1%
46745056 1
 
< 0.1%
38199708 1
 
< 0.1%

pmts_year_1139T
Real number (ℝ)

MISSING 

Distinct7
Distinct (%)< 0.1%
Missing16723732
Missing (%)65.6%
Infinite0
Infinite (%)0.0%
Mean2019.308068
Minimum2015
Maximum2021
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum2015
5-th percentile2018
Q12019
median2019
Q32020
95-th percentile2020
Maximum2021
Range6
Interquartile range (IQR)1

Descriptive statistics

Standard deviation: 0.7898625601
Coefficient of variation (CV): 0.0003911550559
Kurtosis: -0.7153131616
Mean: 2019.308068
Median Absolute Deviation (MAD): 1
Skewness: -0.1929942781
Sum: 1.774487158 × 10^10
Variance: 0.6238828638
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%)
2020 3605350
 
14.1%
2019 3384736
 
13.3%
2018 1493543
 
5.9%
2021 300908
 
1.2%
2017 2764
 
< 0.1%
2016 277
 
< 0.1%
2015 22
 
< 0.1%
(Missing) 16723732
65.6%
ValueCountFrequency (%)
2015 22
 
< 0.1%
2016 277
 
< 0.1%
2017 2764
 
< 0.1%
2018 1493543
5.9%
2019 3384736
13.3%
ValueCountFrequency (%)
2021 300908
 
1.2%
2020 3605350
14.1%
2019 3384736
13.3%
2018 1493543
5.9%
2017 2764
 
< 0.1%

pmts_year_507T
Real number (ℝ)

MISSING 

Distinct22
Distinct (%)< 0.1%
Missing2754316
Missing (%)10.8%
Infinite0
Infinite (%)0.0%
Mean2014.693585
Minimum2000
Maximum2021
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size194.6 MiB

Quantile statistics

Minimum2000
5-th percentile2007
Q12012
median2016
Q32018
95-th percentile2019
Maximum2021
Range21
Interquartile range (IQR)6

Descriptive statistics

Standard deviation: 3.94224462
Coefficient of variation (CV): 0.001956746499
Kurtosis: -0.458399037
Mean: 2014.693585
Median Absolute Deviation (MAD): 2
Skewness: -0.766296528
Sum: 4.584841416 × 10^10
Variance: 15.54129265
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
ValueCountFrequency (%)
2018 3565601
14.0%
2019 3053031
12.0%
2017 2780225
10.9%
2016 1980636
7.8%
2015 1738516
6.8%
2014 1595306
6.3%
2013 1375124
 
5.4%
2012 1116822
 
4.4%
2011 963495
 
3.8%
2007 813761
 
3.2%
Other values (12) 3774499
14.8%
(Missing) 2754316
10.8%
ValueCountFrequency (%)
2000 11
 
< 0.1%
2001 111
 
< 0.1%
2002 571
 
< 0.1%
2003 1591
 
< 0.1%
2004 48122
0.2%
ValueCountFrequency (%)
2021 26130
 
0.1%
2020 537474
 
2.1%
2019 3053031
12.0%
2018 3565601
14.0%
2017 2780225
10.9%
Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size194.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters204090656
Distinct characters15
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowa55475b1
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 24511526
96.1%
ab3c25cf 981325
 
3.8%
15f04f45 10252
 
< 0.1%
be4fd70b 5316
 
< 0.1%
daf49a8a 2853
 
< 0.1%
71ddaa88 21
 
< 0.1%
0c42a10e 18
 
< 0.1%
1d94eac1 16
 
< 0.1%
9ba4314a 5
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 74536407
36.5%
b 25503488
 
12.5%
a 25501496
 
12.5%
4 24540243
 
12.0%
1 24521854
 
12.0%
7 24516863
 
12.0%
c 1962684
 
1.0%
f 1009998
 
0.5%
2 981343
 
0.5%
3 981330
 
0.5%
Other values (5) 34950
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 150099413
73.5%
Lowercase Letter 53991243
 
26.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 74536407
49.7%
4 24540243
 
16.3%
1 24521854
 
16.3%
7 24516863
 
16.3%
2 981343
 
0.7%
3 981330
 
0.7%
0 15604
 
< 0.1%
8 2895
 
< 0.1%
9 2874
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 25503488
47.2%
a 25501496
47.2%
c 1962684
 
3.6%
f 1009998
 
1.9%
d 8227
 
< 0.1%
e 5350
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 150099413
73.5%
Latin 53991243
 
26.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 74536407
49.7%
4 24540243
 
16.3%
1 24521854
 
16.3%
7 24516863
 
16.3%
2 981343
 
0.7%
3 981330
 
0.7%
0 15604
 
< 0.1%
8 2895
 
< 0.1%
9 2874
 
< 0.1%
Latin
ValueCountFrequency (%)
b 25503488
47.2%
a 25501496
47.2%
c 1962684
 
3.6%
f 1009998
 
1.9%
d 8227
 
< 0.1%
e 5350
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 204090656
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 74536407
36.5%
b 25503488
 
12.5%
a 25501496
 
12.5%
4 24540243
 
12.0%
1 24521854
 
12.0%
7 24516863
 
12.0%
c 1962684
 
1.0%
f 1009998
 
0.5%
2 981343
 
0.5%
3 981330
 
0.5%
Other values (5) 34950
 
< 0.1%
Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size194.6 MiB

Length

Max length9
Median length8
Mean length8.000014111
Min length8

Characters and Unicode

Total characters204091016
Distinct characters17
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowab3c25cf
2nd rowa55475b1
3rd rowa55475b1
4th rowa55475b1
5th rowa55475b1
ValueCountFrequency (%)
a55475b1 25197016
98.8%
ab3c25cf 306490
 
1.2%
be4fd70b 3129
 
< 0.1%
15f04f45 2501
 
< 0.1%
daf49a8a 1831
 
< 0.1%
p28_48_88 360
 
< 0.1%
71ddaa88 3
 
< 0.1%
0c42a10e 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 75902540
37.2%
b 25509764
 
12.5%
a 25509007
 
12.5%
4 25207340
 
12.4%
7 25200148
 
12.3%
1 25199522
 
12.3%
c 612982
 
0.3%
f 316452
 
0.2%
2 306852
 
0.2%
3 306490
 
0.2%
Other values (7) 19919
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 152133634
74.5%
Lowercase Letter 51956302
 
25.5%
Connector Punctuation 720
 
< 0.1%
Uppercase Letter 360
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 75902540
49.9%
4 25207340
 
16.6%
7 25200148
 
16.6%
1 25199522
 
16.6%
2 306852
 
0.2%
3 306490
 
0.2%
0 5634
 
< 0.1%
8 3277
 
< 0.1%
9 1831
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 25509764
49.1%
a 25509007
49.1%
c 612982
 
1.2%
f 316452
 
0.6%
d 4966
 
< 0.1%
e 3131
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 720
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 360
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 152134354
74.5%
Latin 51956662
 
25.5%

Most frequent character per script

Common
ValueCountFrequency (%)
5 75902540
49.9%
4 25207340
 
16.6%
7 25200148
 
16.6%
1 25199522
 
16.6%
2 306852
 
0.2%
3 306490
 
0.2%
0 5634
 
< 0.1%
8 3277
 
< 0.1%
9 1831
 
< 0.1%
_ 720
 
< 0.1%
Latin
ValueCountFrequency (%)
b 25509764
49.1%
a 25509007
49.1%
c 612982
 
1.2%
f 316452
 
0.6%
d 4966
 
< 0.1%
e 3131
 
< 0.1%
P 360
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 204091016
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 75902540
37.2%
b 25509764
 
12.5%
a 25509007
 
12.5%
4 25207340
 
12.4%
7 25200148
 
12.3%
1 25199522
 
12.3%
c 612982
 
0.3%
f 316452
 
0.2%
2 306852
 
0.2%
3 306490
 
0.2%
Other values (7) 19919
 
< 0.1%