Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 1286755 |
Missing cells | 10722 |
Missing cells (%) | 0.1% |
Total size in memory | 58.9 MiB |
Average record size in memory | 48.0 B |
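The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (observations × variables) and that 1 MiB is 2^20 bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics table.
n_variables = 6
n_observations = 1_286_755
missing_cells = 10_722
total_size_mib = 58.9

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under those assumptions both values round to the figures reported above.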
Variable types
Numeric | 5 |
---|---|
Text | 1 |
pmts_dpdvalue_108P is highly skewed (γ₁ = 217.5514208) | Skewed |
pmts_pmtsoverdue_635A is highly skewed (γ₁ = 318.3365368) | Skewed |
num_group1 has 723766 (56.2%) zeros | Zeros |
num_group2 has 81889 (6.4%) zeros | Zeros |
pmts_dpdvalue_108P has 1137032 (88.4%) zeros | Zeros |
pmts_pmtsoverdue_635A has 1136910 (88.4%) zeros | Zeros |
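For orientation, the skewness coefficient in the alerts above conventionally denotes the moment coefficient of skewness, m3 / m2^(3/2). The sketch below implements that population formula in plain Python. Whether the profiling tool applies a sample-size correction is not stated in this report, so treat the function as an illustration and not as the tool's exact computation. The toy sample is hypothetical and only mimics a column dominated by zeros.

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5

# Hypothetical zero-inflated sample: nine zeros and one large value.
toy = [0] * 9 + [10]
print(skewness(toy))  # positive, because the long tail is on the right
```

A column that is almost entirely zero with a few very large values pushes this coefficient far above zero, which is the pattern both skew alerts describe.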
Reproduction
Analysis started | 2024-02-13 19:53:21.322245 |
---|---|
Analysis finished | 2024-02-13 19:53:22.209261 |
Duration | 0.89 seconds |
Software version | ydata-profiling v4.6.4 |
Download configuration | config.json |
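A report of this kind is typically produced along the following lines. This is a minimal sketch: the `build_report` helper, the title, and the output path are hypothetical, and the settings actually used are those in config.json, which is not reproduced here.

```python
def build_report(df, out_path="report.html"):
    """Profile a pandas DataFrame and write the report to out_path."""
    # Imported lazily so this module loads without ydata-profiling installed.
    from ydata_profiling import ProfileReport

    profile = ProfileReport(df, title="Dataset profile")  # title is illustrative
    profile.to_file(out_path)
    return out_path
```

Calling `build_report(df)` on the source DataFrame would write an HTML report; timings will differ from the 0.89 seconds recorded above depending on hardware and configuration.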
case_id
Real number (ℝ)
Distinct | 36447 |
---|---|
Distinct (%) | 2.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1229443.91 |
Minimum | 467 |
---|---|
Maximum | 2703436 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.8 MiB |
Quantile statistics
Minimum | 467 |
---|---|
5-th percentile | 127544 |
Q1 | 741898 |
median | 1416105 |
Q3 | 1781534 |
95-th percentile | 1939410 |
Maximum | 2703436 |
Range | 2702969 |
Interquartile range (IQR) | 1039636 |
Descriptive statistics
Standard deviation | 679992.3043 |
---|---|
Coefficient of variation (CV) | 0.5530893267 |
Kurtosis | -0.8850089794 |
Mean | 1229443.91 |
Median Absolute Deviation (MAD) | 462324 |
Skewness | -0.2751192657 |
Sum | 1.581993098 × 10¹² |
Variance | 4.62389534 × 10¹¹ |
Monotonicity | Increasing |
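Several of the case_id figures are simple functions of others in the same tables. The following sketch recomputes them from the reported values, assuming the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Figures copied from the case_id tables.
minimum, maximum = 467, 2_703_436
q1, q3 = 741_898, 1_781_534
mean, std = 1_229_443.91, 679_992.3043

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr)
print(f"{cv:.4f} {variance:.4e}")
```

The recomputed values agree with the table up to the rounding of the reported mean and standard deviation.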
Value | Count | Frequency (%) |
1828337 | 301 | < 0.1% |
938117 | 297 | < 0.1% |
1690585 | 243 | < 0.1% |
1394674 | 242 | < 0.1% |
1027034 | 241 | < 0.1% |
1640901 | 233 | < 0.1% |
33087 | 230 | < 0.1% |
1923653 | 225 | < 0.1% |
1596778 | 216 | < 0.1% |
925676 | 212 | < 0.1% |
Other values (36437) | 1284315 | 99.8% |
Value | Count | Frequency (%) |
467 | 30 | < 0.1% |
1445 | 83 | < 0.1% |
1934 | 79 | < 0.1% |
3159 | 3 | < 0.1% |
3208 | 15 | < 0.1% |
Value | Count | Frequency (%) |
2703436 | 66 | < 0.1% |
2703377 | 36 | < 0.1% |
2703357 | 54 | < 0.1% |
2702661 | 5 | < 0.1% |
2701996 | 30 | < 0.1% |
num_group1
Real number (ℝ)
Alerts: Zeros
Distinct | 21 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.746740444 |
Minimum | 0 |
---|---|
Maximum | 20 |
Zeros | 723766 |
Zeros (%) | 56.2% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.8 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 1 |
95-th percentile | 3 |
Maximum | 20 |
Range | 20 |
Interquartile range (IQR) | 1 |
Descriptive statistics
Standard deviation | 1.121661243 |
---|---|
Coefficient of variation (CV) | 1.50207646 |
Kurtosis | 11.21017492 |
Mean | 0.746740444 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 2.39057725 |
Sum | 960872 |
Variance | 1.258123943 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 723766 | 56.2% |
1 | 327250 | 25.4% |
2 | 139937 | 10.9% |
3 | 58034 | 4.5% |
4 | 22693 | 1.8% |
5 | 8694 | 0.7% |
6 | 3354 | 0.3% |
7 | 1414 | 0.1% |
8 | 780 | 0.1% |
9 | 349 | < 0.1% |
Other values (11) | 484 | < 0.1% |
Value | Count | Frequency (%) |
0 | 723766 | 56.2% |
1 | 327250 | 25.4% |
2 | 139937 | 10.9% |
3 | 58034 | 4.5% |
4 | 22693 | 1.8% |
Value | Count | Frequency (%) |
20 | 6 | < 0.1% |
19 | 36 | < 0.1% |
18 | 10 | < 0.1% |
17 | 5 | < 0.1% |
16 | 16 | < 0.1% |
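The Frequency (%) column of the num_group1 common-values table can be rebuilt from the counts alone. The sketch below assumes shares are rounded to one decimal and that anything rounding below 0.1% is shown as "< 0.1%"; that threshold is inferred from the table, not documented here, and `frequency_label` is a hypothetical helper.

```python
# Counts copied from the num_group1 common-values table.
counts = {0: 723_766, 1: 327_250, 2: 139_937, 3: 58_034, 4: 22_693,
          5: 8_694, 6: 3_354, 7: 1_414, 8: 780, 9: 349, "other": 484}

total = sum(counts.values())  # should equal the number of observations

def frequency_label(count, total):
    """Format a share to one decimal, with a floor label for tiny values."""
    pct = 100 * count / total
    return "< 0.1%" if pct < 0.05 else f"{pct:.1f}%"

for value, count in counts.items():
    print(value, count, frequency_label(count, total))
```

The counts sum to the 1286755 observations reported for the dataset, so the table accounts for every row.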
num_group2
Real number (ℝ)
Alerts: Zeros
Distinct | 37 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 12.31972909 |
Minimum | 0 |
---|---|
Maximum | 36 |
Zeros | 81889 |
Zeros (%) | 6.4% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.8 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 4 |
median | 10 |
Q3 | 19 |
95-th percentile | 32 |
Maximum | 36 |
Range | 36 |
Interquartile range (IQR) | 15 |
Descriptive statistics
Standard deviation | 10.01712949 |
---|---|
Coefficient of variation (CV) | 0.813096572 |
Kurtosis | -0.6684618538 |
Mean | 12.31972909 |
Median Absolute Deviation (MAD) | 7 |
Skewness | 0.6669823053 |
Sum | 15852473 |
Variance | 100.3428832 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 81889 | 6.4% |
1 | 78332 | 6.1% |
2 | 73867 | 5.7% |
3 | 69081 | 5.4% |
4 | 64582 | 5.0% |
5 | 60414 | 4.7% |
6 | 56217 | 4.4% |
7 | 52446 | 4.1% |
8 | 49081 | 3.8% |
9 | 45908 | 3.6% |
Other values (27) | 654938 | 50.9% |
Value | Count | Frequency (%) |
0 | 81889 | 6.4% |
1 | 78332 | 6.1% |
2 | 73867 | 5.7% |
3 | 69081 | 5.4% |
4 | 64582 | 5.0% |
Value | Count | Frequency (%) |
36 | 8191 | 0.6% |
35 | 15029 | 1.2% |
34 | 15496 | 1.2% |
33 | 15961 | 1.2% |
32 | 16457 | 1.3% |
pmts_date_1107D
Text
Distinct | 58 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 9.8 MiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 12867550 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2018-05-15 |
---|---|
2nd row | 2018-11-15 |
3rd row | 2018-04-15 |
4th row | 2016-10-15 |
5th row | 2017-04-15 |
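The column is typed as Text, but every value has length 10 and the sample rows look like ISO 8601 calendar dates (YYYY-MM-DD). Assuming that holds for the whole column, the values can be parsed with the standard library, as in this minimal sketch using only the five sample rows.

```python
from datetime import date

# The five sample rows shown above.
samples = ["2018-05-15", "2018-11-15", "2018-04-15", "2016-10-15", "2017-04-15"]

# date.fromisoformat raises ValueError on anything that is not YYYY-MM-DD.
parsed = [date.fromisoformat(s) for s in samples]

# Collect the distinct day-of-month values to see whether dates share a day.
days = {d.day for d in parsed}
print(min(parsed), max(parsed), days)
```

In the samples every date falls on the 15th of the month, which would be consistent with the low distinct count (58) if the column records monthly dates; the full column would need to be checked to confirm that.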
Value | Count | Frequency (%) |
2019-04-15 | 49266 | 3.8% |
2019-05-15 | 48776 | 3.8% |
2019-03-15 | 47654 | 3.7% |
2019-06-15 | 46214 | 3.6% |
2019-02-15 | 45438 | 3.5% |
2019-01-15 | 44165 | 3.4% |
2019-09-15 | 43817 | 3.4% |
2019-10-15 | 43259 | 3.4% |
2018-12-15 | 42182 | 3.3% |
2019-07-15 | 41856 | 3.3% |
Other values (48) | 834128 | 64.8% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 2965267 | 23.0% |
- | 2573510 | 20.0% |
0 | 2504213 | 19.5% |
2 | 1646205 | 12.8% |
5 | 1396423 | 10.9% |
9 | 630499 | 4.9% |
8 | 477252 | 3.7% |
7 | 306323 | 2.4% |
6 | 153010 | 1.2% |
4 | 107752 | 0.8% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 10294040 | 80.0% |
Dash Punctuation | 2573510 | 20.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 2965267 | 28.8% |
0 | 2504213 | 24.3% |
2 | 1646205 | 16.0% |
5 | 1396423 | 13.6% |
9 | 630499 | 6.1% |
8 | 477252 | 4.6% |
7 | 306323 | 3.0% |
6 | 153010 | 1.5% |
4 | 107752 | 1.0% |
3 | 107096 | 1.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2573510 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 12867550 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 2965267 | 23.0% |
- | 2573510 | 20.0% |
0 | 2504213 | 19.5% |
2 | 1646205 | 12.8% |
5 | 1396423 | 10.9% |
9 | 630499 | 4.9% |
8 | 477252 | 3.7% |
7 | 306323 | 2.4% |
6 | 153010 | 1.2% |
4 | 107752 | 0.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 12867550 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 2965267 | 23.0% |
- | 2573510 | 20.0% |
0 | 2504213 | 19.5% |
2 | 1646205 | 12.8% |
5 | 1396423 | 10.9% |
9 | 630499 | 4.9% |
8 | 477252 | 3.7% |
7 | 306323 | 2.4% |
6 | 153010 | 1.2% |
4 | 107752 | 0.8% |
pmts_dpdvalue_108P
Real number (ℝ)
Alerts: Skewed, Zeros
Distinct | 62861 |
---|---|
Distinct (%) | 4.9% |
Missing | 5361 |
Missing (%) | 0.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 24370.4546 |
Minimum | 0 |
---|---|
Maximum | 185124192 |
Zeros | 1137032 |
Zeros (%) | 88.4% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.8 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 0 |
95-th percentile | 47816 |
Maximum | 185124192 |
Range | 185124192 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 574795.5394 |
---|---|
Coefficient of variation (CV) | 23.58575369 |
Kurtosis | 62904.50888 |
Mean | 24370.4546 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 217.5514208 |
Sum | 3.12281543 × 10¹⁰ |
Variance | 3.303899121 × 10¹¹ |
Monotonicity | Not monotonic |
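Because this variable is both zero-inflated and partly missing, it helps to be explicit about denominators. The sketch below assumes that Zeros (%) is taken over all rows and that Mean is taken over non-missing rows, which is what the reported figures are consistent with; the variable names are illustrative.

```python
# Figures copied from the pmts_dpdvalue_108P tables.
n_rows = 1_286_755
n_missing = 5_361
n_zeros = 1_137_032
total = 3.12281543e10  # Sum

n_present = n_rows - n_missing
n_nonzero = n_present - n_zeros

zeros_of_all = 100 * n_zeros / n_rows         # matches the reported Zeros (%)
zeros_of_present = 100 * n_zeros / n_present  # share among non-missing rows
mean_present = total / n_present              # matches the reported Mean
mean_nonzero = total / n_nonzero              # mean once zeros are excluded

print(f"{zeros_of_all:.1f}% {zeros_of_present:.1f}%")
print(f"{mean_present:.2f} {mean_nonzero:.2f}")
```

The mean over non-zero rows is roughly an order of magnitude larger than the overall mean, which is why the median and quartiles of 0 describe the typical row better than the mean does.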
Value | Count | Frequency (%) |
0 | 1137032 | 88.4% |
4500 | 418 | < 0.1% |
13500 | 318 | < 0.1% |
9000 | 309 | < 0.1% |
1 | 287 | < 0.1% |
17550 | 172 | < 0.1% |
50400 | 129 | < 0.1% |
45 | 128 | < 0.1% |
35100 | 120 | < 0.1% |
21600 | 117 | < 0.1% |
Other values (62851) | 142364 | 11.1% |
(Missing) | 5361 | 0.4% |
Value | Count | Frequency (%) |
0 | 1137032 | 88.4% |
1 | 287 | < 0.1% |
2 | 94 | < 0.1% |
3 | 73 | < 0.1% |
4 | 80 | < 0.1% |
Value | Count | Frequency (%) |
185124192 | 7 | < 0.1% |
120050072 | 1 | < 0.1% |
65954828 | 16 | < 0.1% |
38912444 | 1 | < 0.1% |
38749176 | 1 | < 0.1% |
pmts_pmtsoverdue_635A
Real number (ℝ)
Alerts: Skewed, Zeros
Distinct | 4163 |
---|---|
Distinct (%) | 0.3% |
Missing | 5361 |
Missing (%) | 0.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 11.85945366 |
Minimum | 0 |
---|---|
Maximum | 147470.61 |
Zeros | 1136910 |
Zeros (%) | 88.4% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.8 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 0 |
95-th percentile | 18.2 |
Maximum | 147470.61 |
Range | 147470.61 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 455.1762424 |
---|---|
Coefficient of variation (CV) | 38.38087786 |
Kurtosis | 103110.9284 |
Mean | 11.85945366 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 318.3365368 |
Sum | 15196632.76 |
Variance | 207185.4116 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 1136910 | 88.4% |
0.2 | 14403 | 1.1% |
0.4 | 5374 | 0.4% |
0.8 | 5142 | 0.4% |
0.6 | 4677 | 0.4% |
1 | 3578 | 0.3% |
1.6 | 2977 | 0.2% |
1.4 | 2565 | 0.2% |
2.2 | 2160 | 0.2% |
1.2 | 2115 | 0.2% |
Other values (4153) | 101493 | 7.9% |
(Missing) | 5361 | 0.4% |
Value | Count | Frequency (%) |
0 | 1136910 | 88.4% |
0.2 | 14403 | 1.1% |
0.4 | 5374 | 0.4% |
0.6 | 4677 | 0.4% |
0.8 | 5142 | 0.4% |
Value | Count | Frequency (%) |
147470.61 | 2 | < 0.1% |
147463.4 | 3 | < 0.1% |
147456.8 | 3 | < 0.1% |
147448.8 | 4 | < 0.1% |
1016.2 | 1 | < 0.1% |
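Both payment variables flagged as skewed are non-negative, mostly zero, and have a very long right tail. One common way to make such columns easier to plot or model is a log1p transform, which maps 0 to 0 and compresses large values. The report itself does not prescribe any preprocessing, so this is only a sketch of one option, and `log1p_or_none` is a hypothetical helper.

```python
import math

def log1p_or_none(value):
    """Return log(1 + value), passing missing values (None) through unchanged."""
    if value is None:
        return None
    if value < 0:
        raise ValueError("this sketch assumes non-negative input")
    return math.log1p(value)

# Values taken from the tables above, plus None standing in for a missing cell.
raw = [0, 0.2, 18.2, 1016.2, 147470.61, None]
transformed = [log1p_or_none(v) for v in raw]
print([None if v is None else round(v, 3) for v in transformed])
```

After the transform the maximum of pmts_pmtsoverdue_635A sits just below 12 on the log scale, against a raw range of 0 to 147470.61.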