Dataset statistics
Number of variables | 5 |
---|---|
Number of observations | 1107933 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Total size in memory | 42.3 MiB |
Average record size in memory | 40.0 B |
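The two memory figures above can be cross-checked against each other: the average record size is the total size divided by the number of observations. A minimal sketch, assuming MiB means 2^20 bytes and that each of the five columns costs 8 bytes per row (a 64-bit value or an object pointer; the report does not state the dtypes):

```python
MIB = 1024 ** 2  # bytes per mebibyte

n_rows = 1_107_933   # "Number of observations"
total_mib = 42.3     # "Total size in memory"

# Average record size = total bytes / number of rows.
avg_record_bytes = total_mib * MIB / n_rows
print(f"{avg_record_bytes:.1f} B per record")

# Assumption: every column stores 8 bytes per row.
per_column_mib = 8 * n_rows / MIB
print(f"{per_column_mib:.1f} MiB per column")
```

Under that assumption the per-column figure agrees with the "Memory size" reported for each variable.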
Variable types
Numeric | 3 |
---|---|
Text | 2 |
Alerts
num_group1 has 150732 (13.6%) zeros | Zeros |
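The percentage in this alert is the zero count divided by the number of observations. A short sketch of that arithmetic, using only the figures shown in the report:

```python
n_rows = 1_107_933   # observations in the dataset
n_zeros = 150_732    # zeros reported for num_group1

# Share of rows whose num_group1 value is exactly zero.
zero_share = n_zeros / n_rows
print(f"{zero_share:.1%}")
```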
Reproduction
Analysis started | 2024-02-13 19:58:08.823892 |
---|---|
Analysis finished | 2024-02-13 19:58:11.071055 |
Duration | 2.25 seconds |
Software version | ydata-profiling v4.6.4 |
Download configuration | config.json |
case_id
Real number (ℝ)
Distinct | 150732 |
---|---|
Distinct (%) | 13.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1469876.122 |
Minimum | 49435 |
---|---|
Maximum | 2703452 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.5 MiB |
Quantile statistics
Minimum | 49435 |
---|---|
5-th percentile | 229480 |
Q1 | 997668 |
median | 1854645 |
Q3 | 1907416 |
95-th percentile | 2686146 |
Maximum | 2703452 |
Range | 2654017 |
Interquartile range (IQR) | 909748 |
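Range and IQR are derived from the quantiles listed above rather than measured independently. A minimal sketch that recomputes both from the case_id figures:

```python
# Quantile statistics for case_id, copied from the table above.
minimum, q1, q3, maximum = 49_435, 997_668, 1_907_416, 2_703_452

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50%
print(value_range, iqr)
```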
Descriptive statistics
Standard deviation | 705344.7771 |
---|---|
Coefficient of variation (CV) | 0.4798668178 |
Kurtosis | -0.6728545046 |
Mean | 1469876.122 |
Median Absolute Deviation (MAD) | 84248 |
Skewness | -0.5004240807 |
Sum | 1.628524261 × 10¹² |
Variance | 4.975112545 × 10¹¹ |
Monotonicity | Increasing |
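Several of the descriptive statistics above are functions of one another: CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A sketch that recomputes them from the reported case_id values (small differences are expected because the inputs are rounded):

```python
n = 1_107_933          # number of observations
mean = 1_469_876.122   # reported mean of case_id
std = 705_344.7771     # reported standard deviation of case_id

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance
total = mean * n       # sum of all values

print(f"CV       = {cv:.10f}")
print(f"Variance = {variance:.9e}")
print(f"Sum      = {total:.9e}")
```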
Value | Count | Frequency (%) |
1869915 | 101 | < 0.1% |
2681835 | 83 | < 0.1% |
1917900 | 70 | < 0.1% |
229270 | 65 | < 0.1% |
1863796 | 64 | < 0.1% |
1009026 | 60 | < 0.1% |
1861451 | 59 | < 0.1% |
1853166 | 58 | < 0.1% |
990891 | 56 | < 0.1% |
242302 | 54 | < 0.1% |
Other values (150722) | 1107263 |
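The "Other values" row collapses everything outside the ten most common values: here 150732 - 10 = 150722 distinct values and 1107933 - 670 = 1107263 rows. A minimal sketch of how such a table can be built; `frequency_table` is a hypothetical helper, not a ydata-profiling function:

```python
from collections import Counter

def frequency_table(values, top=10):
    """Return (rows, other): rows holds (value, count, percent) for the
    `top` most common values; other is (distinct_left, rows_left)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = [(v, c, 100 * c / total) for v, c in counts.most_common(top)]
    shown = sum(c for _, c, _ in rows)
    other = (len(counts) - len(rows), total - shown)
    return rows, other

rows, other = frequency_table(["a", "a", "a", "b", "b", "c", "d"], top=2)
print(rows)
print(other)
```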
Value | Count | Frequency (%) |
49435 | 11 | |
49490 | 6 | |
49526 | 2 | < 0.1% |
49563 | 11 | |
49576 | 11 |
Value | Count | Frequency (%) |
2703452 | 6 | |
2703449 | 8 | |
2703448 | 6 | |
2703445 | 5 | |
2703443 | 6 |
amount_4917619A
Real number (ℝ)
Distinct | 191635 |
---|---|
Distinct (%) | 17.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 20104.96572 |
Minimum | 0 |
---|---|
Maximum | 344250 |
Zeros | 32 |
Zeros (%) | < 0.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.5 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 1735 |
Q1 | 6885 |
median | 13130.2 |
Q3 | 24300 |
95-th percentile | 58140.5612 |
Maximum | 344250 |
Range | 344250 |
Interquartile range (IQR) | 17415 |
Descriptive statistics
Standard deviation | 25201.74513 |
---|---|
Coefficient of variation (CV) | 1.253508485 |
Kurtosis | 43.42465887 |
Mean | 20104.96572 |
Median Absolute Deviation (MAD) | 7124.3997 |
Skewness | 5.169141853 |
Sum | 2.227495499 × 10¹⁰ |
Variance | 635127957.8 |
Monotonicity | Not monotonic |
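The monotonicity label describes the values in row order: case_id is reported as "Increasing" (consistent with rows sorted by case_id with repeats), while this variable is "Not monotonic". A sketch of one way to derive such a label; the labels "Increasing" and "Not monotonic" appear in the report, the other three are assumed names:

```python
def monotonicity(values):
    """Classify a sequence by comparing each value with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"          # ties allowed
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"          # ties allowed
    return "Not monotonic"

print(monotonicity([49435, 49435, 49490]))   # repeated ids, still increasing
print(monotonicity([6885, 8100, 644.2]))
```

Sequences with fewer than two elements have no pairs to compare, so this sketch labels them "Strictly increasing".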
Value | Count | Frequency (%) |
6885 | 57533 | 5.2% |
8100 | 16295 | 1.5% |
644.2 | 9243 | 0.8% |
16200 | 8442 | 0.8% |
9720 | 7070 | 0.6% |
7290 | 5882 | 0.5% |
1288.4 | 5426 | 0.5% |
11340 | 4669 | 0.4% |
12960 | 4209 | 0.4% |
24300 | 3908 | 0.4% |
Other values (191625) | 985256 |
Value | Count | Frequency (%) |
0 | 32 | < 0.1% |
0.2 | 23 | < 0.1% |
0.4 | 23 | < 0.1% |
0.6 | 8 | < 0.1% |
0.8 | 88 |
Value | Count | Frequency (%) |
344250 | 755 | |
344248.4 | 3 | < 0.1% |
344162 | 1 | < 0.1% |
344036.22 | 1 | < 0.1% |
344018.4 | 1 | < 0.1% |
Distinct | 260 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 8.5 MiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 11079330 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2019-10-16 |
---|---|
2nd row | 2019-10-16 |
3rd row | 2019-10-16 |
4th row | 2019-10-16 |
5th row | 2019-10-16 |
Value | Count | Frequency (%) |
2020-04-03 | 20426 | 1.8% |
2020-04-09 | 19943 | 1.8% |
2020-05-06 | 16006 | 1.4% |
2020-06-04 | 15974 | 1.4% |
2020-06-05 | 15145 | 1.4% |
2020-06-08 | 14735 | 1.3% |
2020-05-05 | 14713 | 1.3% |
2020-04-02 | 14533 | 1.3% |
2020-05-08 | 13750 | 1.2% |
2020-05-11 | 13408 | 1.2% |
Other values (250) | 949300 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 3855766 | |
2 | 2612860 | |
- | 2215866 | |
1 | 618372 | 5.6% |
3 | 321552 | 2.9% |
4 | 281439 | 2.5% |
6 | 270104 | 2.4% |
5 | 254847 | 2.3% |
7 | 241968 | 2.2% |
9 | 214568 | 1.9% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 8863464 | |
Dash Punctuation | 2215866 | 20.0% |
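The category names above are Unicode general categories: "Decimal Number" is `Nd` and "Dash Punctuation" is `Pd`. Each 10-character date holds 8 digits and 2 hyphens, which over 1107933 rows gives the 8863464 and 2215866 counts shown. A minimal sketch using Python's standard `unicodedata` module; `category_counts` is a hypothetical helper:

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for s in strings:
        counts.update(unicodedata.category(ch) for ch in s)
    return counts

sample = category_counts(["2019-10-16"])
print(dict(sample))
```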
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 3855766 | |
2 | 2612860 | |
1 | 618372 | 7.0% |
3 | 321552 | 3.6% |
4 | 281439 | 3.2% |
6 | 270104 | 3.0% |
5 | 254847 | 2.9% |
7 | 241968 | 2.7% |
9 | 214568 | 2.4% |
8 | 191988 | 2.2% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2215866 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 11079330 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 3855766 | |
2 | 2612860 | |
- | 2215866 | |
1 | 618372 | 5.6% |
3 | 321552 | 2.9% |
4 | 281439 | 2.5% |
6 | 270104 | 2.4% |
5 | 254847 | 2.3% |
7 | 241968 | 2.2% |
9 | 214568 | 1.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 11079330 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 3855766 | |
2 | 2612860 | |
- | 2215866 | |
1 | 618372 | 5.6% |
3 | 321552 | 2.9% |
4 | 281439 | 2.5% |
6 | 270104 | 2.4% |
5 | 254847 | 2.3% |
7 | 241968 | 2.2% |
9 | 214568 | 1.9% |
name_4917606M
Text
Distinct | 55857 |
---|---|
Distinct (%) | 5.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 8.5 MiB |
Length
Max length | 12 |
---|---|
Median length | 8 |
Mean length | 8.056925825 |
Min length | 8 |
Characters and Unicode
Total characters | 8926534 |
---|---|
Distinct characters | 18 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 6154 |
---|---|
Unique (%) | 0.6% |
Sample
1st row | 6b730375 |
---|---|
2nd row | 6b730375 |
3rd row | 6b730375 |
4th row | 6b730375 |
5th row | 6b730375 |
Value | Count | Frequency (%) |
5e180ef0 | 85284 | 7.7% |
p114_118_163 | 7205 | 0.7% |
74ca9587 | 7153 | 0.6% |
7444479d | 5173 | 0.5% |
3613fb71 | 4079 | 0.4% |
a409d8fa | 3529 | 0.3% |
36a9355c | 3499 | 0.3% |
e304888c | 3465 | 0.3% |
cda1fd10 | 3278 | 0.3% |
c75d2f47 | 3203 | 0.3% |
Other values (55847) | 982065 |
Most occurring characters
Value | Count | Frequency (%) |
1 | 675640 | 7.6% |
0 | 673508 | 7.5% |
e | 664650 | 7.4% |
5 | 605680 | 6.8% |
8 | 599952 | 6.7% |
f | 562940 | 6.3% |
7 | 527523 | 5.9% |
d | 524884 | 5.9% |
4 | 523107 | 5.9% |
3 | 518634 | 5.8% |
Other values (8) | 3050016 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 5632463 | |
Lowercase Letter | 3232751 | |
Connector Punctuation | 40880 | 0.5% |
Uppercase Letter | 20440 | 0.2% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 675640 | |
0 | 673508 | |
5 | 605680 | |
8 | 599952 | |
7 | 527523 | |
4 | 523107 | |
3 | 518634 | |
9 | 512812 | |
6 | 502308 | |
2 | 493299 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 664650 | |
f | 562940 | |
d | 524884 | |
c | 499668 | |
a | 491977 | |
b | 488632 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 40880 |
Uppercase Letter
Value | Count | Frequency (%) |
P | 20440 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 5673343 | |
Latin | 3253191 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 675640 | |
0 | 673508 | |
5 | 605680 | |
8 | 599952 | |
7 | 527523 | |
4 | 523107 | |
3 | 518634 | |
9 | 512812 | |
6 | 502308 | |
2 | 493299 |
Latin
Value | Count | Frequency (%) |
e | 664650 | |
f | 562940 | |
d | 524884 | |
c | 499668 | |
a | 491977 | |
b | 488632 | |
P | 20440 | 0.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 8926534 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 675640 | 7.6% |
0 | 673508 | 7.5% |
e | 664650 | 7.4% |
5 | 605680 | 6.8% |
8 | 599952 | 6.7% |
f | 562940 | 6.3% |
7 | 527523 | 5.9% |
d | 524884 | 5.9% |
4 | 523107 | 5.9% |
3 | 518634 | 5.8% |
Other values (8) | 3050016 |
num_group1
Real number (ℝ)
ZEROS
 
Distinct | 101 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.144719942 |
Minimum | 0 |
---|---|
Maximum | 100 |
Zeros | 150732 |
Zeros (%) | 13.6% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.5 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 1 |
median | 3 |
Q3 | 5 |
95-th percentile | 12 |
Maximum | 100 |
Range | 100 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 4.108048495 |
---|---|
Coefficient of variation (CV) | 0.9911522495 |
Kurtosis | 16.79635319 |
Mean | 4.144719942 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 2.609646326 |
Sum | 4592072 |
Variance | 16.87606243 |
Monotonicity | Not monotonic |
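The Median Absolute Deviation (MAD) above is a robust measure of spread: the median of the absolute distances from the median. A sketch assuming the unscaled form (no normal-consistency factor), which is consistent with the integer value reported here; `median_abs_deviation` is a hypothetical helper:

```python
from statistics import median

def median_abs_deviation(values):
    """Unscaled MAD: median of |x - median(x)|."""
    values = list(values)
    m = median(values)
    return median(abs(v - m) for v in values)

print(median_abs_deviation([1, 1, 2, 2, 4, 6, 9]))
```

Unlike the standard deviation, this statistic is barely affected by the long right tail that the skewness and kurtosis values indicate.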
Value | Count | Frequency (%) |
0 | 150732 | |
1 | 149210 | |
2 | 146224 | |
3 | 141814 | |
4 | 135063 | |
5 | 116602 | |
6 | 59418 | 5.4% |
7 | 42568 | 3.8% |
8 | 34229 | 3.1% |
9 | 28762 | 2.6% |
Other values (91) | 103311 |
Value | Count | Frequency (%) |
0 | 150732 | |
1 | 149210 | |
2 | 146224 | |
3 | 141814 | |
4 | 135063 |
Value | Count | Frequency (%) |
100 | 1 | |
99 | 1 | |
98 | 1 | |
97 | 1 | |
96 | 1 |