Overview

Dataset statistics

Number of variables: 5
Number of observations: 1526659
Missing cells: 0
Missing cells (%): 0.0%
Total size in memory: 58.2 MiB
Average record size in memory: 40.0 B
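The memory figures above can be cross-checked with simple arithmetic. The sketch below assumes each of the 5 columns is held as one 8-byte value per row (for example an int64, or an object pointer for the text column); that storage layout is an assumption, not something the report states.

```python
# Figures copied from the dataset statistics above.
n_rows = 1_526_659
n_cols = 5

# Assumption: every column stores one 8-byte value per row.
bytes_per_value = 8

record_bytes = n_cols * bytes_per_value          # average record size in bytes
total_mib = n_rows * record_bytes / 2**20        # whole dataset, in MiB
column_mib = n_rows * bytes_per_value / 2**20    # a single column, in MiB

print(f"record size: {record_bytes} B")
print(f"total: {total_mib:.1f} MiB, per column: {column_mib:.1f} MiB")
```

Under that assumption the results agree with the reported 40.0 B per record, 58.2 MiB in total, and 11.6 MiB per variable.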

Variable types

Numeric: 4
Text: 1

Alerts

case_id has unique values (Unique)
WEEK_NUM has 16735 (1.1%) zeros (Zeros)
target has 1478665 (96.9%) zeros (Zeros)
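The zeros alert on target amounts to a class-imbalance statement. As a minimal sketch, assuming target is a binary label (the report shows only the values 0 and 1), the positive rate and the negative-to-positive ratio follow directly from the counts in the alerts:

```python
# Counts copied from the overview and the alerts above.
n_total = 1_526_659
n_zero = 1_478_665

# Assumption: target is binary, so every non-zero row is a 1.
n_one = n_total - n_zero
positive_rate = n_one / n_total
imbalance_ratio = n_zero / n_one

print(f"positives: {n_one} ({positive_rate:.1%})")
print(f"negatives per positive: {imbalance_ratio:.1f}")
```

A ratio of roughly 31 negatives per positive suggests that plain accuracy would be a poor metric for any model trained on this column.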

Reproduction

Analysis started: 2024-02-13 19:38:12.006909
Analysis finished: 2024-02-13 19:38:13.818603
Duration: 1.81 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json

Variables

case_id
Real number (ℝ)

UNIQUE 

Distinct: 1526659
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1286076.572
Minimum: 0
Maximum: 2703454
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 121766.9
Q1: 766197.5
median: 1357358
Q3: 1739022.5
95-th percentile: 2627080.1
Maximum: 2703454
Range: 2703454
Interquartile range (IQR): 972825

Descriptive statistics

Standard deviation: 718946.5923
Coefficient of variation (CV): 0.5590231624
Kurtosis: -0.5871633686
Mean: 1286076.572
Median Absolute Deviation (MAD): 486413
Skewness: 0.1354512807
Sum: 1.963400373 × 10^12
Variance: 5.168842026 × 10^11
Monotonicity: Strictly increasing
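Several of the case_id statistics are derived from others, so they can be checked against each other. The sketch below uses only the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, mean = sum / count) and the figures reported above; the variable names are illustrative.

```python
import math

# Figures copied from the case_id statistics in this report.
n = 1_526_659
minimum, maximum = 0, 2_703_454
q1, q3 = 766_197.5, 1_739_022.5
mean = 1_286_076.572
std = 718_946.5923
total = 1.963400373e12  # reported sum, rounded to 10 significant digits

value_range = maximum - minimum
iqr = q3 - q1
cv = std / mean

print(value_range, iqr, round(cv, 6))
# The rounded sum reproduces the mean only to about 8 significant digits.
print(math.isclose(total / n, mean, rel_tol=1e-8))
```

Note that the maximum (2703454) is well above the row count (1526659), so although case_id is strictly increasing, the identifiers are not contiguous.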
Histogram with fixed size bins (bins=50)
Value                      Count      Frequency (%)
0                          1          < 0.1%
1611798                    1          < 0.1%
1611807                    1          < 0.1%
1611806                    1          < 0.1%
1611805                    1          < 0.1%
1611804                    1          < 0.1%
1611803                    1          < 0.1%
1611802                    1          < 0.1%
1611801                    1          < 0.1%
1611800                    1          < 0.1%
Other values (1526649)     1526649    > 99.9%

Value      Count    Frequency (%)
0          1        < 0.1%
1          1        < 0.1%
2          1        < 0.1%
3          1        < 0.1%
4          1        < 0.1%

Value      Count    Frequency (%)
2703454    1        < 0.1%
2703453    1        < 0.1%
2703452    1        < 0.1%
2703451    1        < 0.1%
2703450    1        < 0.1%
Distinct: 644
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 15266590
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
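As an illustration of how such character properties can be read, the following sketch uses Python's standard `unicodedata` module to count general categories in two date strings of the same 10-character form as the values in this variable. The choice of sample strings is illustrative; the category codes `Nd` (Decimal Number) and `Pd` (Dash Punctuation) are the standard Unicode abbreviations.

```python
import unicodedata
from collections import Counter

# Two strings in the same YYYY-MM-DD form as this variable's values.
samples = ["2019-01-03", "2019-01-04"]

# Count the Unicode general category of every character.
counts = Counter(unicodedata.category(ch) for s in samples for ch in s)

total_chars = sum(counts.values())
for category, count in counts.items():
    # 'Nd' = Decimal Number, 'Pd' = Dash Punctuation
    print(category, count, f"{count / total_chars:.1%}")
```

Each 10-character value contributes 8 digits and 2 hyphens, which matches the 80.0% / 20.0% split between Decimal Number and Dash Punctuation reported for the full column.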

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 2019-01-03
2nd row: 2019-01-03
3rd row: 2019-01-04
4th row: 2019-01-03
5th row: 2019-01-04
Value                 Count      Frequency (%)
2019-11-29            8812       0.6%
2019-11-30            8756       0.6%
2019-12-28            6900       0.5%
2019-12-29            6537       0.4%
2019-11-17            6340       0.4%
2019-12-30            6327       0.4%
2019-11-16            5882       0.4%
2019-12-14            5864       0.4%
2019-12-02            5719       0.4%
2019-12-15            5635       0.4%
Other values (634)    1459887    95.6%

Most occurring characters

Value    Count      Frequency (%)
0        3843197    25.2%
-        3053318    20.0%
2        2888882    18.9%
1        2388207    15.6%
9        1382896    9.1%
3        352108     2.3%
8        307159     2.0%
6        290851     1.9%
7        283172     1.9%
5        243675     1.6%

Most occurring categories

Value               Count       Frequency (%)
Decimal Number      12213272    80.0%
Dash Punctuation    3053318     20.0%

Most frequent character per category

Decimal Number
Value    Count      Frequency (%)
0        3843197    31.5%
2        2888882    23.7%
1        2388207    19.6%
9        1382896    11.3%
3        352108     2.9%
8        307159     2.5%
6        290851     2.4%
7        283172     2.3%
5        243675     2.0%
4        233125     1.9%
Dash Punctuation
Value    Count      Frequency (%)
-        3053318    100.0%

Most occurring scripts

Value     Count       Frequency (%)
Common    15266590    100.0%

Most frequent character per script

Common
Value    Count      Frequency (%)
0        3843197    25.2%
-        3053318    20.0%
2        2888882    18.9%
1        2388207    15.6%
9        1382896    9.1%
3        352108     2.3%
8        307159     2.0%
6        290851     1.9%
7        283172     1.9%
5        243675     1.6%

Most occurring blocks

Value    Count       Frequency (%)
ASCII    15266590    100.0%

Most frequent character per block

ASCII
Value    Count      Frequency (%)
0        3843197    25.2%
-        3053318    20.0%
2        2888882    18.9%
1        2388207    15.6%
9        1382896    9.1%
3        352108     2.3%
8        307159     2.0%
6        290851     1.9%
7        283172     1.9%
5        243675     1.6%

MONTH
Real number (ℝ)

Distinct: 22
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 201936.288
Minimum: 201901
Maximum: 202010
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 MiB

Quantile statistics

Minimum: 201901
5-th percentile: 201902
Q1: 201906
median: 201910
Q3: 202001
95-th percentile: 202008
Maximum: 202010
Range: 109
Interquartile range (IQR): 95

Descriptive statistics

Standard deviation: 44.7359745
Coefficient of variation (CV): 0.0002215350938
Kurtosis: -1.215285612
Mean: 201936.288
Median Absolute Deviation (MAD): 5
Skewness: 0.8707251911
Sum: 3.082878515 × 10^11
Variance: 2001.307415
Monotonicity: Not monotonic
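The MONTH values look like calendar months packed into integers as YYYYMM, which would explain why numeric summaries such as the range (109) and the IQR (95) do not correspond to month counts. The sketch below assumes that encoding; `split_yyyymm` and `months_between` are hypothetical helper names, not part of any library.

```python
def split_yyyymm(value: int) -> tuple[int, int]:
    """Split an integer such as 201901 into (year, month)."""
    return divmod(value, 100)


def months_between(first: int, last: int) -> int:
    """Number of calendar months from first to last, inclusive."""
    y1, m1 = split_yyyymm(first)
    y2, m2 = split_yyyymm(last)
    return (y2 - y1) * 12 + (m2 - m1) + 1


# Minimum and maximum copied from the MONTH statistics above.
print(split_yyyymm(201901), split_yyyymm(202010))
print(months_between(201901, 202010))   # calendar span in months
print(202010 - 201901)                  # naive numeric range
```

Under the YYYYMM assumption the inclusive span is 22 months, matching the 22 distinct values reported, while the naive numeric difference reproduces the reported range of 109.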
Histogram with fixed size bins (bins=22)
Value                Count     Frequency (%)
201912               126011    8.3%
201911               115845    7.6%
201908               98741     6.5%
201909               98706     6.5%
201907               97566     6.4%
201910               95149     6.2%
201906               94398     6.2%
202001               86750     5.7%
201901               75529     4.9%
202002               75183     4.9%
Other values (12)    562781    36.9%

Value     Count    Frequency (%)
201901    75529    4.9%
201902    63064    4.1%
201903    69147    4.5%
201904    72012    4.7%
201905    64594    4.2%

Value     Count    Frequency (%)
202010    8592     0.6%
202009    61905    4.1%
202008    50831    3.3%
202007    28912    1.9%
202006    45962    3.0%

WEEK_NUM
Real number (ℝ)

ZEROS 

Distinct: 92
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.76903618
Minimum: 0
Maximum: 91
Zeros: 16735
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 23
median: 40
Q3: 55
95-th percentile: 86
Maximum: 91
Range: 91
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 23.79798129
Coefficient of variation (CV): 0.5837268556
Kurtosis: -0.6533283719
Mean: 40.76903618
Median Absolute Deviation (MAD): 17
Skewness: 0.2971192891
Sum: 62240416
Variance: 566.3439136
Monotonicity: Not monotonic
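WEEK_NUM takes 92 distinct values (0 to 91), and the date-like text variable in this report has 644 distinct values, which is exactly 92 × 7. One reading consistent with those counts is that WEEK_NUM is the number of whole weeks elapsed since the first date in the data. The sketch below illustrates that reading only; the start date `START` is a hypothetical placeholder (the report does not state the earliest date), and the relationship itself is an assumption rather than something the report confirms.

```python
from datetime import date, timedelta

# Hypothetical first date in the data; not stated anywhere in the report.
START = date(2019, 1, 1)


def week_num(d: date) -> int:
    """Whole weeks elapsed since START (assumed definition of WEEK_NUM)."""
    return (d - START).days // 7


# 644 consecutive days cover exactly 92 seven-day weeks, numbered 0..91.
all_days = [START + timedelta(days=i) for i in range(644)]
weeks = sorted({week_num(d) for d in all_days})

print(len(weeks), weeks[0], weeks[-1])
```

If the true start date or week definition differs (for example ISO weeks), the mapping would change, so this should be verified against the raw data before relying on it.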
Histogram with fixed size bins (bins=50)
Value                Count      Frequency (%)
51                   35920      2.4%
47                   31888      2.1%
49                   30938      2.0%
45                   28947      1.9%
23                   26734      1.8%
50                   26244      1.7%
44                   25887      1.7%
35                   24137      1.6%
40                   23907      1.6%
32                   23889      1.6%
Other values (82)    1248168    81.8%

Value    Count    Frequency (%)
0        16735    1.1%
1        18841    1.2%
2        17476    1.1%
3        16108    1.1%
4        14309    0.9%

Value    Count    Frequency (%)
91       12674    0.8%
90       12103    0.8%
89       13600    0.9%
88       14234    0.9%
87       17886    1.2%

target
Real number (ℝ)

ZEROS 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.03143727578
Minimum: 0
Maximum: 1
Zeros: 1478665
Zeros (%): 96.9%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 1
Range: 1
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.1744963994
Coefficient of variation (CV): 5.550620883
Kurtosis: 26.8419215
Mean: 0.03143727578
Median Absolute Deviation (MAD): 0
Skewness: 5.370464257
Sum: 47994
Variance: 0.03044899341
Monotonicity: Not monotonic
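Because target only takes the values 0 and 1, its descriptive statistics are all functions of the positive rate p. The sketch below applies the textbook Bernoulli formulas (variance pq, skewness (q - p)/sqrt(pq), excess kurtosis (1 - 6pq)/(pq)) to the counts in this report. These are population formulas; the report's figures appear to use sample-corrected estimators, so agreement is expected only to several significant digits.

```python
import math

# Counts copied from this report: 47994 ones out of 1526659 rows.
p = 47_994 / 1_526_659
q = 1 - p

variance = p * q
std = math.sqrt(variance)
skewness = (q - p) / std
excess_kurtosis = (1 - 6 * p * q) / (p * q)
cv = std / p  # coefficient of variation: std / mean, and the mean equals p

print(f"variance={variance:.6f} std={std:.6f} cv={cv:.4f}")
print(f"skewness={skewness:.4f} excess kurtosis={excess_kurtosis:.4f}")
```

The large skewness and kurtosis here are a direct consequence of the roughly 3.1% positive rate rather than a sign of outliers.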
Histogram with fixed size bins (bins=2)
Value    Count      Frequency (%)
0        1478665    96.9%
1        47994      3.1%

Value    Count      Frequency (%)
0        1478665    96.9%
1        47994      3.1%

Value    Count      Frequency (%)
1        47994      3.1%
0        1478665    96.9%