Overview

Dataset statistics

Number of variables: 19
Number of observations: 5296031
Missing cells: 37184092
Missing cells (%): 37.0%
Total size in memory: 767.7 MiB
Average record size in memory: 152.0 B
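
The percentage and per-record figures in this table follow from the raw counts. A minimal sketch of that arithmetic, using only the values reported above (the variable names are illustrative, and 1 MiB is taken as 1024 × 1024 bytes):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 5_296_031
missing_cells = 37_184_092
total_size_mib = 767.7

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells:         {total_cells}")
print(f"missing cells (%):   {missing_pct:.1f}")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both derived values round to the figures shown in the table (37.0% and 152.0 B).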

Variable types

Numeric: 8
Text: 6
Unsupported: 5

Alerts

collater_typofvalofguarant_407M has constant value "" (Constant)
collaterals_typeofguarante_359M has constant value "" (Constant)
subjectroles_name_541M has constant value "" (Constant)
collater_valueofguarantee_1124L has 5093116 (96.2%) missing values (Missing)
collater_valueofguarantee_876L has 5296031 (100.0%) missing values (Missing)
pmts_dpd_1073P has 2806418 (53.0%) missing values (Missing)
pmts_dpd_303P has 5296031 (100.0%) missing values (Missing)
pmts_month_706T has 5296031 (100.0%) missing values (Missing)
pmts_overdue_1140A has 2804357 (53.0%) missing values (Missing)
pmts_overdue_1152A has 5296031 (100.0%) missing values (Missing)
pmts_year_507T has 5296031 (100.0%) missing values (Missing)
collater_valueofguarantee_1124L is highly skewed (γ1 = 53.73476843) (Skewed)
pmts_overdue_1140A is highly skewed (γ1 = 202.1253289) (Skewed)
collater_valueofguarantee_876L is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
pmts_dpd_303P is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
pmts_month_706T is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
pmts_overdue_1152A is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
pmts_year_507T is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
collater_valueofguarantee_1124L has 188430 (3.6%) zeros (Zeros)
num_group1 has 2915256 (55.0%) zeros (Zeros)
num_group2 has 197095 (3.7%) zeros (Zeros)
pmts_dpd_1073P has 2335900 (44.1%) zeros (Zeros)
pmts_overdue_1140A has 2335817 (44.1%) zeros (Zeros)
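
The alerts can be turned into a short list of columns that carry no information. The sketch below uses only the counts and names listed in the alerts; treating fully missing and constant columns as drop candidates is an assumption about the downstream workflow, not something the report prescribes.

```python
n_rows = 5_296_031

# Missing-value counts as listed in the alerts.
missing = {
    "collater_valueofguarantee_1124L": 5_093_116,
    "collater_valueofguarantee_876L": 5_296_031,
    "pmts_dpd_1073P": 2_806_418,
    "pmts_dpd_303P": 5_296_031,
    "pmts_month_706T": 5_296_031,
    "pmts_overdue_1140A": 2_804_357,
    "pmts_overdue_1152A": 5_296_031,
    "pmts_year_507T": 5_296_031,
}

# Columns flagged as constant in the alerts.
constant = [
    "collater_typofvalofguarant_407M",
    "collaterals_typeofguarante_359M",
    "subjectroles_name_541M",
]

# A column is fully missing when its missing count equals the row count.
fully_missing = sorted(col for col, m in missing.items() if m == n_rows)
drop_candidates = fully_missing + constant

for col in drop_candidates:
    print(col)
```

The five fully missing columns are the same five that the report classifies as Unsupported.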

Reproduction

Analysis started: 2024-02-13 19:43:25.185582
Analysis finished: 2024-02-13 19:43:40.132818
Duration: 14.95 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
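
The reported duration is the difference between the two timestamps. A small check with the standard library, using the timestamps exactly as printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-02-13 19:43:25.185582")
finished = datetime.fromisoformat("2024-02-13 19:43:40.132818")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```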

Variables

case_id
Real number (ℝ)

Distinct: 98303
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1063781.521
Minimum: 388
Maximum: 2548729
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 40.4 MiB

Quantile statistics

Minimum: 388
5-th percentile: 105340
Q1: 622237
median: 1259890
Q3: 1284811
95-th percentile: 2543489
Maximum: 2548729
Range: 2548341
Interquartile range (IQR): 662574

Descriptive statistics

Standard deviation: 662226.6264
Coefficient of variation (CV): 0.6225212726
Kurtosis: 0.31816212
Mean: 1063781.521
Median Absolute Deviation (MAD): 38518
Skewness: 0.557298106
Sum: 5.633819914 × 10^12
Variance: 4.385441047 × 10^11
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
Most frequent values
Value                   Count     Frequency (%)
617817                  1131      < 0.1%
624267                  732       < 0.1%
641873                  552       < 0.1%
1280884                 516       < 0.1%
1286727                 348       < 0.1%
1295729                 336       < 0.1%
1293107                 324       < 0.1%
1293089                 312       < 0.1%
1290178                 300       < 0.1%
638376                  300       < 0.1%
Other values (98293)    5291180   99.9%

Smallest values
Value     Count   Frequency (%)
388       36      < 0.1%
405       60      < 0.1%
409       48      < 0.1%
410       24      < 0.1%
411       120     < 0.1%

Largest values
Value     Count   Frequency (%)
2548729   108     < 0.1%
2548728   24      < 0.1%
2548727   36      < 0.1%
2548726   96      < 0.1%
2548725   36      < 0.1%
Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.4 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 42368248
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8fd95e4b
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 5093116
96.2%
9a0c095e 138021
 
2.6%
8fd95e4b 64740
 
1.2%
06fb9ba8 154
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 15482109
36.5%
a 5231291
 
12.3%
b 5158164
 
12.2%
4 5157856
 
12.2%
7 5093116
 
12.0%
1 5093116
 
12.0%
9 340936
 
0.8%
0 276196
 
0.7%
e 202761
 
0.5%
c 138021
 
0.3%
Other values (4) 194682
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 31508377
74.4%
Lowercase Letter 10859871
 
25.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 15482109
49.1%
4 5157856
 
16.4%
7 5093116
 
16.2%
1 5093116
 
16.2%
9 340936
 
1.1%
0 276196
 
0.9%
8 64894
 
0.2%
6 154
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 5231291
48.2%
b 5158164
47.5%
e 202761
 
1.9%
c 138021
 
1.3%
f 64894
 
0.6%
d 64740
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Common 31508377
74.4%
Latin 10859871
 
25.6%

Most frequent character per script

Common
ValueCountFrequency (%)
5 15482109
49.1%
4 5157856
 
16.4%
7 5093116
 
16.2%
1 5093116
 
16.2%
9 340936
 
1.1%
0 276196
 
0.9%
8 64894
 
0.2%
6 154
 
< 0.1%
Latin
ValueCountFrequency (%)
a 5231291
48.2%
b 5158164
47.5%
e 202761
 
1.9%
c 138021
 
1.3%
f 64894
 
0.6%
d 64740
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42368248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 15482109
36.5%
a 5231291
 
12.3%
b 5158164
 
12.2%
4 5157856
 
12.2%
7 5093116
 
12.0%
1 5093116
 
12.0%
9 340936
 
0.8%
0 276196
 
0.7%
e 202761
 
0.5%
c 138021
 
0.3%
Other values (4) 194682
 
0.5%

collater_typofvalofguarant_407M
Text

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.4 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 42368248
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 5296031
100.0%

Most occurring characters

ValueCountFrequency (%)
5 15888093
37.5%
a 5296031
 
12.5%
4 5296031
 
12.5%
7 5296031
 
12.5%
b 5296031
 
12.5%
1 5296031
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 31776186
75.0%
Lowercase Letter 10592062
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 15888093
50.0%
4 5296031
 
16.7%
7 5296031
 
16.7%
1 5296031
 
16.7%
Lowercase Letter
ValueCountFrequency (%)
a 5296031
50.0%
b 5296031
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 31776186
75.0%
Latin 10592062
 
25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 15888093
50.0%
4 5296031
 
16.7%
7 5296031
 
16.7%
1 5296031
 
16.7%
Latin
ValueCountFrequency (%)
a 5296031
50.0%
b 5296031
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42368248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 15888093
37.5%
a 5296031
 
12.5%
4 5296031
 
12.5%
7 5296031
 
12.5%
b 5296031
 
12.5%
1 5296031
 
12.5%

collater_valueofguarantee_1124L
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 9526
Distinct (%): 4.7%
Missing: 5093116
Missing (%): 96.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1863340.599
Minimum: 0
Maximum: 3200000000
Zeros: 188430
Zeros (%): 3.6%
Negative: 0
Negative (%): 0.0%
Memory size: 40.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3712142.8
Maximum: 3200000000
Range: 3200000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 33559092.64
Coefficient of variation (CV): 18.01017627
Kurtosis: 3996.973233
Mean: 1863340.599
Median Absolute Deviation (MAD): 0
Skewness: 53.73476843
Sum: 3.780997577 × 10^11
Variance: 1.126212699 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                  Count     Frequency (%)
0                      188430    3.6%
100000000              163       < 0.1%
200000000              138       < 0.1%
800000000              132       < 0.1%
3000000                98        < 0.1%
2000000                84        < 0.1%
4000000                82        < 0.1%
1                      75        < 0.1%
5000000                63        < 0.1%
6000000                58        < 0.1%
Other values (9516)    13592     0.3%
(Missing)              5093116   96.2%

Smallest values
Value   Count    Frequency (%)
0       188430   3.6%
0.9     2        < 0.1%
1       75       < 0.1%
100     3        < 0.1%
230     1        < 0.1%

Largest values
Value         Count   Frequency (%)
3200000000    6       < 0.1%
2500000000    8       < 0.1%
2100690411    1       < 0.1%
1908326000    1       < 0.1%
932264044.2   1       < 0.1%

collater_valueofguarantee_876L
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5296031
Missing (%): 100.0%
Memory size: 40.4 MiB

collaterals_typeofguarante_359M
Text

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.4 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 42368248
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 5296031
100.0%

Most occurring characters

ValueCountFrequency (%)
5 15888093
37.5%
a 5296031
 
12.5%
4 5296031
 
12.5%
7 5296031
 
12.5%
b 5296031
 
12.5%
1 5296031
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 31776186
75.0%
Lowercase Letter 10592062
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 15888093
50.0%
4 5296031
 
16.7%
7 5296031
 
16.7%
1 5296031
 
16.7%
Lowercase Letter
ValueCountFrequency (%)
a 5296031
50.0%
b 5296031
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 31776186
75.0%
Latin 10592062
 
25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 15888093
50.0%
4 5296031
 
16.7%
7 5296031
 
16.7%
1 5296031
 
16.7%
Latin
ValueCountFrequency (%)
a 5296031
50.0%
b 5296031
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42368248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 15888093
37.5%
a 5296031
 
12.5%
4 5296031
 
12.5%
7 5296031
 
12.5%
b 5296031
 
12.5%
1 5296031
 
12.5%
Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.4 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 42368248
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: c7a5ad39
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 5093116
96.2%
c7a5ad39 181173
 
3.4%
9276e4bb 8128
 
0.2%
0e63c0f0 7920
 
0.1%
168ad9f3 1877
 
< 0.1%
7b62420e 1839
 
< 0.1%
940efad7 539
 
< 0.1%
f4d8a027 512
 
< 0.1%
3cbe86ba 343
 
< 0.1%
2fd21cf1 215
 
< 0.1%
Other values (5) 369
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 15460689
36.5%
a 5459276
 
12.9%
7 5285535
 
12.5%
b 5112124
 
12.1%
4 5104619
 
12.0%
1 5095449
 
12.0%
9 191751
 
0.5%
3 191455
 
0.5%
c 189694
 
0.4%
d 184316
 
0.4%
Other values (6) 93340
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 31392764
74.1%
Lowercase Letter 10975484
 
25.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 15460689
49.2%
7 5285535
 
16.8%
4 5104619
 
16.3%
1 5095449
 
16.2%
9 191751
 
0.6%
3 191455
 
0.6%
0 27176
 
0.1%
6 20334
 
0.1%
2 12998
 
< 0.1%
8 2758
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 5459276
49.7%
b 5112124
46.6%
c 189694
 
1.7%
d 184316
 
1.7%
e 18795
 
0.2%
f 11279
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 31392764
74.1%
Latin 10975484
 
25.9%

Most frequent character per script

Common
ValueCountFrequency (%)
5 15460689
49.2%
7 5285535
 
16.8%
4 5104619
 
16.3%
1 5095449
 
16.2%
9 191751
 
0.6%
3 191455
 
0.6%
0 27176
 
0.1%
6 20334
 
0.1%
2 12998
 
< 0.1%
8 2758
 
< 0.1%
Latin
ValueCountFrequency (%)
a 5459276
49.7%
b 5112124
46.6%
c 189694
 
1.7%
d 184316
 
1.7%
e 18795
 
0.2%
f 11279
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42368248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 15460689
36.5%
a 5459276
 
12.9%
7 5285535
 
12.5%
b 5112124
 
12.1%
4 5104619
 
12.0%
1 5095449
 
12.0%
9 191751
 
0.5%
3 191455
 
0.5%
c 189694
 
0.4%
d 184316
 
0.4%
Other values (6) 93340
 
0.2%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 47
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7214553691
Minimum: 0
Maximum: 46
Zeros: 2915256
Zeros (%): 55.0%
Negative: 0
Negative (%): 0.0%
Memory size: 40.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 46
Range: 46
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.102630299
Coefficient of variation (CV): 1.528341663
Kurtosis: 144.7256567
Mean: 0.7214553691
Median Absolute Deviation (MAD): 0
Skewness: 6.070452508
Sum: 3820850
Variance: 1.215793576
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Most frequent values
Value               Count     Frequency (%)
0                   2915256   55.0%
1                   1459928   27.6%
2                   600132    11.3%
3                   214200    4.0%
4                   68772     1.3%
5                   22980     0.4%
6                   8016      0.2%
7                   2784      0.1%
8                   1272      < 0.1%
9                   552       < 0.1%
Other values (37)   2139      < 0.1%

Smallest values
Value   Count     Frequency (%)
0       2915256   55.0%
1       1459928   27.6%
2       600132    11.3%
3       214200    4.0%
4       68772     1.3%

Largest values
Value   Count   Frequency (%)
46      15      < 0.1%
45      24      < 0.1%
44      24      < 0.1%
43      24      < 0.1%
42      24      < 0.1%

num_group2
Real number (ℝ)

ZEROS 

Distinct: 36
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.02342263
Minimum: 0
Maximum: 35
Zeros: 197095
Zeros (%): 3.7%
Negative: 0
Negative (%): 0.0%
Memory size: 40.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
median: 13
Q3: 21
95-th percentile: 32
Maximum: 35
Range: 35
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 9.312802598
Coefficient of variation (CV): 0.6640891343
Kurtosis: -0.7376050779
Mean: 14.02342263
Median Absolute Deviation (MAD): 7
Skewness: 0.3982469587
Sum: 74268481
Variance: 86.72829223
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
Most frequent values
Value               Count     Frequency (%)
0                   197095    3.7%
6                   197095    3.7%
11                  197095    3.7%
10                  197095    3.7%
8                   197095    3.7%
7                   197095    3.7%
9                   197095    3.7%
5                   197095    3.7%
4                   197095    3.7%
3                   197095    3.7%
Other values (26)   3325081   62.8%

Smallest values
Value   Count    Frequency (%)
0       197095   3.7%
1       197095   3.7%
2       197095   3.7%
3       197095   3.7%
4       197095   3.7%

Largest values
Value   Count   Frequency (%)
35      69233   1.3%
34      69233   1.3%
33      69233   1.3%
32      69233   1.3%
31      69234   1.3%

pmts_dpd_1073P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3358
Distinct (%): 0.1%
Missing: 2806418
Missing (%): 53.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.1621742
Minimum: 0
Maximum: 4877
Zeros: 2335900
Zeros (%): 44.1%
Negative: 0
Negative (%): 0.0%
Memory size: 40.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 4877
Range: 4877
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 138.7310755
Coefficient of variation (CV): 10.54013367
Kurtosis: 295.5590194
Mean: 13.1621742
Median Absolute Deviation (MAD): 0
Skewness: 15.45964878
Sum: 32768720
Variance: 19246.31131
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                 Count     Frequency (%)
0                     2335900   44.1%
1                     20888     0.4%
3                     10573     0.2%
2                     9554      0.2%
4                     9295      0.2%
7                     5001      0.1%
5                     4931      0.1%
6                     4842      0.1%
8                     3676      0.1%
9                     3575      0.1%
Other values (3348)   81378     1.5%
(Missing)             2806418   53.0%

Smallest values
Value   Count     Frequency (%)
0       2335900   44.1%
1       20888     0.4%
2       9554      0.2%
3       10573     0.2%
4       9295      0.2%

Largest values
Value   Count   Frequency (%)
4877    1       < 0.1%
4850    1       < 0.1%
4823    1       < 0.1%
4806    1       < 0.1%
4773    1       < 0.1%

pmts_dpd_303P
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5296031
Missing (%): 100.0%
Memory size: 40.4 MiB

pmts_month_158T
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 23
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 40.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.452052855
Coefficient of variation (CV): 0.5310850547
Kurtosis: -1.216783233
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 34424052
Variance: 11.91666892
Monotonicity: Not monotonic
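
These statistics are what a perfectly uniform month distribution would produce. The sketch below rebuilds them from the value counts reported for this variable (441334 rows for each of the 12 months); using the n − 1 denominator for the variance is an assumption, chosen because it reproduces the reported figure.

```python
# Each month 1..12 occurs 441334 times according to the value counts.
counts = {month: 441_334 for month in range(1, 13)}

n = sum(counts.values())                         # non-missing rows
total = sum(m * c for m, c in counts.items())    # Sum
mean = total / n

# Sum of squared deviations, weighted by the count of each month.
ss = sum(c * (m - mean) ** 2 for m, c in counts.items())
sample_variance = ss / (n - 1)                   # assumed n - 1 denominator

print(f"n        = {n}")
print(f"sum      = {total}")
print(f"mean     = {mean}")
print(f"variance = {sample_variance:.8f}")
```

The row count recovered here plus the 23 missing rows gives the 5296031 observations of the dataset.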
Histogram with fixed size bins (bins=12)
Most frequent values
Value              Count    Frequency (%)
2                  441334   8.3%
3                  441334   8.3%
4                  441334   8.3%
5                  441334   8.3%
6                  441334   8.3%
7                  441334   8.3%
8                  441334   8.3%
9                  441334   8.3%
10                 441334   8.3%
11                 441334   8.3%
Other values (2)   882668   16.7%

Smallest values
Value   Count    Frequency (%)
1       441334   8.3%
2       441334   8.3%
3       441334   8.3%
4       441334   8.3%
5       441334   8.3%

Largest values
Value   Count    Frequency (%)
12      441334   8.3%
11      441334   8.3%
10      441334   8.3%
9       441334   8.3%
8       441334   8.3%

pmts_month_706T
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5296031
Missing (%): 100.0%
Memory size: 40.4 MiB

pmts_overdue_1140A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 112241
Distinct (%): 4.5%
Missing: 2804357
Missing (%): 53.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1438.95848
Minimum: 0
Maximum: 15237162
Zeros: 2335817
Zeros (%): 44.1%
Negative: 0
Negative (%): 0.0%
Memory size: 40.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1012.3793
Maximum: 15237162
Range: 15237162
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 39508.26943
Coefficient of variation (CV): 27.45615664
Kurtosis: 65542.76491
Mean: 1438.95848
Median Absolute Deviation (MAD): 0
Skewness: 202.1253289
Sum: 3585415432
Variance: 1560903353
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                   Count     Frequency (%)
0                       2335817   44.1%
1000                    334       < 0.1%
2000                    232       < 0.1%
3000                    209       < 0.1%
400                     189       < 0.1%
2600                    164       < 0.1%
0.8                     144       < 0.1%
2                       140       < 0.1%
800                     128       < 0.1%
0.4                     124       < 0.1%
Other values (112231)   154193    2.9%
(Missing)               2804357   53.0%

Smallest values
Value   Count     Frequency (%)
0       2335817   44.1%
0.002   22        < 0.1%
0.004   3         < 0.1%
0.006   8         < 0.1%
0.008   13        < 0.1%

Largest values
Value       Count   Frequency (%)
15237162    2       < 0.1%
15233466    4       < 0.1%
14662000    1       < 0.1%
7399121.5   1       < 0.1%
7008248     1       < 0.1%

pmts_overdue_1152A
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5296031
Missing (%): 100.0%
Memory size: 40.4 MiB

pmts_year_1139T
Real number (ℝ)

Distinct: 6
Distinct (%): < 0.1%
Missing: 23
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.264301
Minimum: 2015
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 40.4 MiB

Quantile statistics

Minimum: 2015
5-th percentile: 2017
Q1: 2018
median: 2018
Q3: 2019
95-th percentile: 2019
Maximum: 2020
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7832686924
Coefficient of variation (CV): 0.0003880902477
Kurtosis: -0.7007227773
Mean: 2018.264301
Median Absolute Deviation (MAD): 1
Skewness: -0.1184982258
Sum: 1.068874388 × 10^10
Variance: 0.6135098445
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Most frequent values
Value       Count     Frequency (%)
2018        2181653   41.2%
2019        2010214   38.0%
2017        936478    17.7%
2020        165420    3.1%
2016        1891      < 0.1%
2015        352       < 0.1%
(Missing)   23        < 0.1%

Smallest values
Value   Count     Frequency (%)
2015    352       < 0.1%
2016    1891      < 0.1%
2017    936478    17.7%
2018    2181653   41.2%
2019    2010214   38.0%

Largest values
Value   Count     Frequency (%)
2020    165420    3.1%
2019    2010214   38.0%
2018    2181653   41.2%
2017    936478    17.7%
2016    1891      < 0.1%

pmts_year_507T
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5296031
Missing (%): 100.0%
Memory size: 40.4 MiB

subjectroles_name_541M
Text

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.4 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 42368248
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 5296031
100.0%

Most occurring characters

ValueCountFrequency (%)
5 15888093
37.5%
a 5296031
 
12.5%
4 5296031
 
12.5%
7 5296031
 
12.5%
b 5296031
 
12.5%
1 5296031
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 31776186
75.0%
Lowercase Letter 10592062
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 15888093
50.0%
4 5296031
 
16.7%
7 5296031
 
16.7%
1 5296031
 
16.7%
Lowercase Letter
ValueCountFrequency (%)
a 5296031
50.0%
b 5296031
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 31776186
75.0%
Latin 10592062
 
25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 15888093
50.0%
4 5296031
 
16.7%
7 5296031
 
16.7%
1 5296031
 
16.7%
Latin
ValueCountFrequency (%)
a 5296031
50.0%
b 5296031
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42368248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 15888093
37.5%
a 5296031
 
12.5%
4 5296031
 
12.5%
7 5296031
 
12.5%
b 5296031
 
12.5%
1 5296031
 
12.5%
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.4 MiB

Length

Max length: 9
Median length: 8
Mean length: 8.000058157
Min length: 8

Characters and Unicode

Total characters: 42368556
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ab3c25cf
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 5097671
96.3%
ab3c25cf 190737
 
3.6%
be4fd70b 4749
 
0.1%
15f04f45 1313
 
< 0.1%
daf49a8a 1251
 
< 0.1%
p28_48_88 308
 
< 0.1%
71ddaa88 2
 
< 0.1%
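
The length statistics for this variable can be reproduced from the value counts above: every value is 8 characters long except p28_48_88, which has 9 characters and occurs 308 times. A minimal check using only those reported figures:

```python
n_rows = 5_296_031
long_rows = 308                    # rows holding the single 9-character value

# All remaining rows hold 8-character values.
total_chars = 8 * (n_rows - long_rows) + 9 * long_rows
mean_length = total_chars / n_rows

print(f"total characters: {total_chars}")
print(f"mean length:      {mean_length:.9f}")
```

This also explains why the maximum length is 9 while the median and minimum stay at 8.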

Most occurring characters

ValueCountFrequency (%)
5 15486376
36.6%
b 5297906
 
12.5%
a 5292165
 
12.5%
4 5106605
 
12.1%
7 5102422
 
12.0%
1 5098986
 
12.0%
c 381474
 
0.9%
f 199363
 
0.5%
2 191045
 
0.5%
3 190737
 
0.5%
Other values (7) 21477
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 31185971
73.6%
Lowercase Letter 11181661
 
26.4%
Connector Punctuation 616
 
< 0.1%
Uppercase Letter 308
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 15486376
49.7%
4 5106605
 
16.4%
7 5102422
 
16.4%
1 5098986
 
16.4%
2 191045
 
0.6%
3 190737
 
0.6%
0 6062
 
< 0.1%
8 2487
 
< 0.1%
9 1251
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 5297906
47.4%
a 5292165
47.3%
c 381474
 
3.4%
f 199363
 
1.8%
d 6004
 
0.1%
e 4749
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 616
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 308
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 31186587
73.6%
Latin 11181969
 
26.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 15486376
49.7%
4 5106605
 
16.4%
7 5102422
 
16.4%
1 5098986
 
16.3%
2 191045
 
0.6%
3 190737
 
0.6%
0 6062
 
< 0.1%
8 2487
 
< 0.1%
9 1251
 
< 0.1%
_ 616
 
< 0.1%
Latin
ValueCountFrequency (%)
b 5297906
47.4%
a 5292165
47.3%
c 381474
 
3.4%
f 199363
 
1.8%
d 6004
 
0.1%
e 4749
 
< 0.1%
P 308
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42368556
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 15486376
36.6%
b 5297906
 
12.5%
a 5292165
 
12.5%
4 5106605
 
12.1%
7 5102422
 
12.0%
1 5098986
 
12.0%
c 381474
 
0.9%
f 199363
 
0.5%
2 191045
 
0.5%
3 190737
 
0.5%
Other values (7) 21477
 
0.1%