Dataset statistics
| Number of variables | 6 |
|---|---|
| Number of observations | 1286755 |
| Missing cells | 10722 |
| Missing cells (%) | 0.1% |
| Total size in memory | 58.9 MiB |
| Average record size in memory | 48.0 B |
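The derived figures in this table follow from the raw counts. The sketch below recomputes them, using only numbers copied from the table; the variable names are illustrative and are not taken from the profiling tool.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 6
n_observations = 1_286_755
missing_cells = 10_722
total_size_mib = 58.9

# A "cell" is one value of one variable in one observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Total cells:         {total_cells}")
print(f"Missing cells (%):   {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The 10722 missing cells also equal 2 × 5361, which matches the two variables further down that each report 5361 missing values.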
Variable types
| Numeric | 5 |
|---|---|
| Text | 1 |
Alerts
| Alert | Type |
|---|---|
| pmts_dpdvalue_108P is highly skewed (γ1 = 217.5514208) | Skewed |
| pmts_pmtsoverdue_635A is highly skewed (γ1 = 318.3365368) | Skewed |
| num_group1 has 723766 (56.2%) zeros | Zeros |
| num_group2 has 81889 (6.4%) zeros | Zeros |
| pmts_dpdvalue_108P has 1137032 (88.4%) zeros | Zeros |
| pmts_pmtsoverdue_635A has 1136910 (88.4%) zeros | Zeros |
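The γ1 values in the skewness alerts are skewness coefficients. As a minimal sketch, the population (Fisher-Pearson) form is g1 = m3 / m2^1.5, where m2 and m3 are the second and third central moments. It is an assumption that this matches the profiler's estimator exactly (it may apply a bias correction), and the toy sample below is hypothetical, chosen only to mimic a mostly-zero column with one extreme value.

```python
def skewness(values):
    """Population (Fisher-Pearson) skewness g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical toy data, not taken from the profiled dataset.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
zero_inflated = [0.0] * 88 + [10.0] * 11 + [100_000.0]

print(skewness(symmetric))      # symmetric data has zero skewness
print(skewness(zero_inflated))  # a single extreme value gives a large positive skew
```

This illustrates why the two variables flagged as skewed are also flagged for zeros: a large mass at zero combined with a few very large values pushes γ1 far above zero.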
Reproduction
| Analysis started | 2024-02-13 19:53:21.322245 |
|---|---|
| Analysis finished | 2024-02-13 19:53:22.209261 |
| Duration | 0.89 seconds |
| Software version | ydata-profiling v4.6.4 |
case_id
Real number (ℝ)
| Distinct | 36447 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1229443.91 |
| Minimum | 467 |
| Maximum | 2703436 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.8 MiB |
Quantile statistics
| Minimum | 467 |
|---|---|
| 5-th percentile | 127544 |
| Q1 | 741898 |
| median | 1416105 |
| Q3 | 1781534 |
| 95-th percentile | 1939410 |
| Maximum | 2703436 |
| Range | 2702969 |
| Interquartile range (IQR) | 1039636 |
Descriptive statistics
| Standard deviation | 679992.3043 |
|---|---|
| Coefficient of variation (CV) | 0.5530893267 |
| Kurtosis | -0.8850089794 |
| Mean | 1229443.91 |
| Median Absolute Deviation (MAD) | 462324 |
| Skewness | -0.2751192657 |
| Sum | 1.581993098 × 10¹² |
| Variance | 4.62389534 × 10¹¹ |
| Monotonicity | Increasing |
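Several rows in the quantile and descriptive tables are derived from others. The following sketch recomputes them for case_id from the reported values; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the case_id tables above.
minimum, maximum = 467, 2_703_436
q1, q3 = 741_898, 1_781_534
mean, std = 1_229_443.91, 679_992.3043

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(value_range)
print(iqr)
print(f"{cv:.6f}")
print(f"{variance:.6e}")
```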
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1828337 | 301 | < 0.1% |
| 938117 | 297 | < 0.1% |
| 1690585 | 243 | < 0.1% |
| 1394674 | 242 | < 0.1% |
| 1027034 | 241 | < 0.1% |
| 1640901 | 233 | < 0.1% |
| 33087 | 230 | < 0.1% |
| 1923653 | 225 | < 0.1% |
| 1596778 | 216 | < 0.1% |
| 925676 | 212 | < 0.1% |
| Other values (36437) | 1284315 | 99.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 467 | 30 | < 0.1% |
| 1445 | 83 | < 0.1% |
| 1934 | 79 | < 0.1% |
| 3159 | 3 | < 0.1% |
| 3208 | 15 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2703436 | 66 | < 0.1% |
| 2703377 | 36 | < 0.1% |
| 2703357 | 54 | < 0.1% |
| 2702661 | 5 | < 0.1% |
| 2701996 | 30 | < 0.1% |
num_group1
Real number (ℝ)
ZEROS 
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.746740444 |
| Minimum | 0 |
| Maximum | 20 |
| Zeros | 723766 |
| Zeros (%) | 56.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 20 |
| Range | 20 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.121661243 |
|---|---|
| Coefficient of variation (CV) | 1.50207646 |
| Kurtosis | 11.21017492 |
| Mean | 0.746740444 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.39057725 |
| Sum | 960872 |
| Variance | 1.258123943 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=21)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 723766 | 56.2% |
| 1 | 327250 | 25.4% |
| 2 | 139937 | 10.9% |
| 3 | 58034 | 4.5% |
| 4 | 22693 | 1.8% |
| 5 | 8694 | 0.7% |
| 6 | 3354 | 0.3% |
| 7 | 1414 | 0.1% |
| 8 | 780 | 0.1% |
| 9 | 349 | < 0.1% |
| Other values (11) | 484 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 723766 | 56.2% |
| 1 | 327250 | 25.4% |
| 2 | 139937 | 10.9% |
| 3 | 58034 | 4.5% |
| 4 | 22693 | 1.8% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 6 | < 0.1% |
| 19 | 36 | < 0.1% |
| 18 | 10 | < 0.1% |
| 17 | 5 | < 0.1% |
| 16 | 16 | < 0.1% |
num_group2
Real number (ℝ)
ZEROS 
| Distinct | 37 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.31972909 |
| Minimum | 0 |
| Maximum | 36 |
| Zeros | 81889 |
| Zeros (%) | 6.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 4 |
| median | 10 |
| Q3 | 19 |
| 95-th percentile | 32 |
| Maximum | 36 |
| Range | 36 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 10.01712949 |
|---|---|
| Coefficient of variation (CV) | 0.813096572 |
| Kurtosis | -0.6684618538 |
| Mean | 12.31972909 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.6669823053 |
| Sum | 15852473 |
| Variance | 100.3428832 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=37)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 81889 | 6.4% |
| 1 | 78332 | 6.1% |
| 2 | 73867 | 5.7% |
| 3 | 69081 | 5.4% |
| 4 | 64582 | 5.0% |
| 5 | 60414 | 4.7% |
| 6 | 56217 | 4.4% |
| 7 | 52446 | 4.1% |
| 8 | 49081 | 3.8% |
| 9 | 45908 | 3.6% |
| Other values (27) | 654938 | 50.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 81889 | 6.4% |
| 1 | 78332 | 6.1% |
| 2 | 73867 | 5.7% |
| 3 | 69081 | 5.4% |
| 4 | 64582 | 5.0% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 36 | 8191 | 0.6% |
| 35 | 15029 | 1.2% |
| 34 | 15496 | 1.2% |
| 33 | 15961 | 1.2% |
| 32 | 16457 | 1.3% |
pmts_date_1107D
Text
| Distinct | 58 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.8 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 12867550 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2018-05-15 |
|---|---|
| 2nd row | 2018-11-15 |
| 3rd row | 2018-04-15 |
| 4th row | 2016-10-15 |
| 5th row | 2017-04-15 |
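Although pmts_date_1107D is profiled as Text, every sample row is a 10-character ISO 8601 date. The sketch below parses the five sample rows with the standard library; it only demonstrates that these strings are valid dates and makes no claim about the remaining rows.

```python
from datetime import date

# The five sample rows listed in the report.
sample = ["2018-05-15", "2018-11-15", "2018-04-15", "2016-10-15", "2017-04-15"]

# date.fromisoformat raises ValueError on anything that is not YYYY-MM-DD.
parsed = [date.fromisoformat(s) for s in sample]

lengths = {len(s) for s in sample}       # fixed length, as in the Length table
days_of_month = {d.day for d in parsed}  # all samples fall on the 15th

print(min(parsed), max(parsed))
print(lengths, days_of_month)
```

A fixed length of 10 is also consistent with the character totals: 10 × 1286755 = 12867550 characters, of which 2 × 1286755 = 2573510 are dashes.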
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2019-04-15 | 49266 | 3.8% |
| 2019-05-15 | 48776 | 3.8% |
| 2019-03-15 | 47654 | 3.7% |
| 2019-06-15 | 46214 | 3.6% |
| 2019-02-15 | 45438 | 3.5% |
| 2019-01-15 | 44165 | 3.4% |
| 2019-09-15 | 43817 | 3.4% |
| 2019-10-15 | 43259 | 3.4% |
| 2018-12-15 | 42182 | 3.3% |
| 2019-07-15 | 41856 | 3.3% |
| Other values (48) | 834128 | 64.8% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2965267 | 23.0% |
| - | 2573510 | 20.0% |
| 0 | 2504213 | 19.5% |
| 2 | 1646205 | 12.8% |
| 5 | 1396423 | 10.9% |
| 9 | 630499 | 4.9% |
| 8 | 477252 | 3.7% |
| 7 | 306323 | 2.4% |
| 6 | 153010 | 1.2% |
| 4 | 107752 | 0.8% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 10294040 | 80.0% |
| Dash Punctuation | 2573510 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2965267 | 28.8% |
| 0 | 2504213 | 24.3% |
| 2 | 1646205 | 16.0% |
| 5 | 1396423 | 13.6% |
| 9 | 630499 | 6.1% |
| 8 | 477252 | 4.6% |
| 7 | 306323 | 3.0% |
| 6 | 153010 | 1.5% |
| 4 | 107752 | 1.0% |
| 3 | 107096 | 1.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| - | 2573510 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 12867550 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2965267 | 23.0% |
| - | 2573510 | 20.0% |
| 0 | 2504213 | 19.5% |
| 2 | 1646205 | 12.8% |
| 5 | 1396423 | 10.9% |
| 9 | 630499 | 4.9% |
| 8 | 477252 | 3.7% |
| 7 | 306323 | 2.4% |
| 6 | 153010 | 1.2% |
| 4 | 107752 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 12867550 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2965267 | 23.0% |
| - | 2573510 | 20.0% |
| 0 | 2504213 | 19.5% |
| 2 | 1646205 | 12.8% |
| 5 | 1396423 | 10.9% |
| 9 | 630499 | 4.9% |
| 8 | 477252 | 3.7% |
| 7 | 306323 | 2.4% |
| 6 | 153010 | 1.2% |
| 4 | 107752 | 0.8% |
pmts_dpdvalue_108P
Real number (ℝ)
SKEWED  ZEROS 
| Distinct | 62861 |
|---|---|
| Distinct (%) | 4.9% |
| Missing | 5361 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24370.4546 |
| Minimum | 0 |
| Maximum | 185124192 |
| Zeros | 1137032 |
| Zeros (%) | 88.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 47816 |
| Maximum | 185124192 |
| Range | 185124192 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 574795.5394 |
|---|---|
| Coefficient of variation (CV) | 23.58575369 |
| Kurtosis | 62904.50888 |
| Mean | 24370.4546 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 217.5514208 |
| Sum | 3.12281543 × 10¹⁰ |
| Variance | 3.303899121 × 10¹¹ |
| Monotonicity | Not monotonic |
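For a variable with missing values, it helps to know which denominator each statistic uses. The sketch below, built only from the figures reported for pmts_dpdvalue_108P, suggests that the mean is taken over non-missing rows while the zeros percentage is taken over all rows. This is an inference from the numbers, not a statement about the profiler's documented behaviour, and the variable names are illustrative.

```python
# Values copied from the pmts_dpdvalue_108P tables above.
n_observations = 1_286_755
missing = 5_361
zeros = 1_137_032
total_sum = 3.12281543e10

non_missing = n_observations - missing

# Mean over non-missing rows reproduces the reported mean (24370.4546).
mean_non_missing = total_sum / non_missing
# Mean over all rows does not.
mean_all_rows = total_sum / n_observations

# Zeros as a share of all rows reproduces the reported 88.4%.
zeros_pct_all = 100 * zeros / n_observations
zeros_pct_non_missing = 100 * zeros / non_missing

print(f"{mean_non_missing:.4f} vs {mean_all_rows:.4f}")
print(f"{zeros_pct_all:.1f}% vs {zeros_pct_non_missing:.1f}%")
```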
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1137032 | 88.4% |
| 4500 | 418 | < 0.1% |
| 13500 | 318 | < 0.1% |
| 9000 | 309 | < 0.1% |
| 1 | 287 | < 0.1% |
| 17550 | 172 | < 0.1% |
| 50400 | 129 | < 0.1% |
| 45 | 128 | < 0.1% |
| 35100 | 120 | < 0.1% |
| 21600 | 117 | < 0.1% |
| Other values (62851) | 142364 | 11.1% |
| (Missing) | 5361 | 0.4% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1137032 | 88.4% |
| 1 | 287 | < 0.1% |
| 2 | 94 | < 0.1% |
| 3 | 73 | < 0.1% |
| 4 | 80 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 185124192 | 7 | < 0.1% |
| 120050072 | 1 | < 0.1% |
| 65954828 | 16 | < 0.1% |
| 38912444 | 1 | < 0.1% |
| 38749176 | 1 | < 0.1% |
pmts_pmtsoverdue_635A
Real number (ℝ)
SKEWED  ZEROS 
| Distinct | 4163 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 5361 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.85945366 |
| Minimum | 0 |
| Maximum | 147470.61 |
| Zeros | 1136910 |
| Zeros (%) | 88.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 18.2 |
| Maximum | 147470.61 |
| Range | 147470.61 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 455.1762424 |
|---|---|
| Coefficient of variation (CV) | 38.38087786 |
| Kurtosis | 103110.9284 |
| Mean | 11.85945366 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 318.3365368 |
| Sum | 15196632.76 |
| Variance | 207185.4116 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1136910 | 88.4% |
| 0.2 | 14403 | 1.1% |
| 0.4 | 5374 | 0.4% |
| 0.8 | 5142 | 0.4% |
| 0.6 | 4677 | 0.4% |
| 1 | 3578 | 0.3% |
| 1.6 | 2977 | 0.2% |
| 1.4 | 2565 | 0.2% |
| 2.2 | 2160 | 0.2% |
| 1.2 | 2115 | 0.2% |
| Other values (4153) | 101493 | 7.9% |
| (Missing) | 5361 | 0.4% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1136910 | 88.4% |
| 0.2 | 14403 | 1.1% |
| 0.4 | 5374 | 0.4% |
| 0.6 | 4677 | 0.4% |
| 0.8 | 5142 | 0.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 147470.61 | 2 | < 0.1% |
| 147463.4 | 3 | < 0.1% |
| 147456.8 | 3 | < 0.1% |
| 147448.8 | 4 | < 0.1% |
| 1016.2 | 1 | < 0.1% |