Overview

Dataset statistics

Number of variables: 41
Number of observations: 3887684
Missing cells: 49311059
Missing cells (%): 30.9%
Total size in memory: 1.2 GiB
Average record size in memory: 328.0 B
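
The missing-cell percentage follows from the other figures in this table: missing cells divided by the total cell count (variables × observations). A minimal Python check of that arithmetic, using only the numbers reported above (the variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 41
n_observations = 3_887_684
missing_cells = 49_311_059

# Total cells = columns x rows; the missing share is a simple ratio.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

The ratio rounds to the 30.9% shown in the table.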

Variable types

Numeric: 20
Text: 19
Boolean: 2

Alerts

isbidproduct_390L is highly imbalanced (68.7%) [Imbalance]
annuity_853A has 155851 (4.0%) missing values [Missing]
approvaldate_319D has 1766021 (45.4%) missing values [Missing]
byoccupationinc_3656910L has 2896024 (74.5%) missing values [Missing]
childnum_21L has 1953893 (50.3%) missing values [Missing]
credacc_actualbalance_314A has 3719506 (95.7%) missing values [Missing]
credacc_credlmt_575A has 119070 (3.1%) missing values [Missing]
credacc_maxhisbal_375A has 3719506 (95.7%) missing values [Missing]
credacc_minhisbal_90A has 3719506 (95.7%) missing values [Missing]
credacc_status_367L has 3719506 (95.7%) missing values [Missing]
credacc_transactions_402L has 3719506 (95.7%) missing values [Missing]
credamount_590A has 123329 (3.2%) missing values [Missing]
credtype_587L has 123329 (3.2%) missing values [Missing]
currdebt_94A has 1270377 (32.7%) missing values [Missing]
dateactivated_425D has 1844702 (47.4%) missing values [Missing]
downpmt_134A has 123329 (3.2%) missing values [Missing]
dtlastpmt_581D has 2860375 (73.6%) missing values [Missing]
dtlastpmtallstes_3545839D has 2434155 (62.6%) missing values [Missing]
employedfrom_700D has 2180869 (56.1%) missing values [Missing]
familystate_726L has 1245201 (32.0%) missing values [Missing]
firstnonzeroinstldate_307D has 365175 (9.4%) missing values [Missing]
inittransactioncode_279L has 123329 (3.2%) missing values [Missing]
isdebitcard_527L has 3637550 (93.6%) missing values [Missing]
maxdpdtolerance_577P has 1817378 (46.7%) missing values [Missing]
outstandingdebt_522A has 1277922 (32.9%) missing values [Missing]
pmtnum_8L has 312833 (8.0%) missing values [Missing]
revolvingaccount_394A has 3731033 (96.0%) missing values [Missing]
tenor_203L has 312833 (8.0%) missing values [Missing]
actualdpd_943P is highly skewed (γ1 = 716.1410421) [Skewed]
credacc_maxhisbal_375A is highly skewed (γ1 = 154.6093224) [Skewed]
actualdpd_943P has 3882797 (99.9%) zeros [Zeros]
annuity_853A has 225443 (5.8%) zeros [Zeros]
byoccupationinc_3656910L has 63137 (1.6%) zeros [Zeros]
childnum_21L has 1054010 (27.1%) zeros [Zeros]
credacc_credlmt_575A has 3470494 (89.3%) zeros [Zeros]
credacc_maxhisbal_375A has 98392 (2.5%) zeros [Zeros]
credacc_minhisbal_90A has 100911 (2.6%) zeros [Zeros]
credacc_transactions_402L has 150233 (3.9%) zeros [Zeros]
credamount_590A has 42592 (1.1%) zeros [Zeros]
currdebt_94A has 2238778 (57.6%) zeros [Zeros]
downpmt_134A has 3381858 (87.0%) zeros [Zeros]
maxdpdtolerance_577P has 1527003 (39.3%) zeros [Zeros]
num_group1 has 782997 (20.1%) zeros [Zeros]
outstandingdebt_522A has 2229538 (57.3%) zeros [Zeros]
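
One way to act on the Missing alerts is to bucket columns by their missing share before deciding what to drop, flag, or impute. The sketch below does that for a handful of columns copied from the list above; the 90% and 30% cut-offs and the bucket names are illustrative assumptions, not thresholds used by the report.

```python
# Missing share (%) for a few columns, copied from the alerts above.
missing_pct = {
    "annuity_853A": 4.0,
    "approvaldate_319D": 45.4,
    "byoccupationinc_3656910L": 74.5,
    "credacc_actualbalance_314A": 95.7,
    "isdebitcard_527L": 93.6,
    "revolvingaccount_394A": 96.0,
    "tenor_203L": 8.0,
}


def bucket(pct, drop_at=90.0, review_at=30.0):
    """Assign a hypothetical triage bucket from a missing percentage."""
    if pct >= drop_at:
        return "mostly empty"
    if pct >= review_at:
        return "review"
    return "usable"


# Group column names by bucket.
triage = {}
for column, pct in missing_pct.items():
    triage.setdefault(bucket(pct), []).append(column)

for name, columns in triage.items():
    print(f"{name}: {', '.join(sorted(columns))}")
```

Keeping the cut-offs as parameters makes it easy to rerun the grouping with stricter or looser limits.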

Reproduction

Analysis started: 2024-02-13 19:36:08.396202
Analysis finished: 2024-02-13 19:36:34.918128
Duration: 26.52 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
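
A report of this kind is typically produced with the `ProfileReport` class from ydata-profiling. The sketch below assumes the data is available as a CSV file; the function name, file paths, and report title are hypothetical, and the settings actually used for this report live in the downloadable configuration, which is not reproduced here.

```python
def build_profile(csv_path, html_path="profile.html"):
    """Profile a CSV file and write the report as HTML (illustrative)."""
    # Imported lazily so this module still loads where the
    # profiling dependencies are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    frame = pd.read_csv(csv_path)
    report = ProfileReport(frame, title="Profile report")
    report.to_file(html_path)
    return html_path
```

Calling `build_profile("data.csv")` would write `profile.html` next to the script, provided pandas and ydata-profiling are installed.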

Variables

case_id
Real number (ℝ)

Distinct: 782997
Distinct (%): 20.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1397916.185
Minimum: 2
Maximum: 2651092
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 2
5-th percentile: 123162
Q1: 1251506
median: 1451959
Q3: 1641585
95-th percentile: 2617890
Maximum: 2651092
Range: 2651090
Interquartile range (IQR): 390079

Descriptive statistics

Standard deviation: 760159.4198
Coefficient of variation (CV): 0.5437803984
Kurtosis: -0.5110858461
Mean: 1397916.185
Median Absolute Deviation (MAD): 194660
Skewness: -0.1345953203
Sum: 5.434656384 × 10^12
Variance: 5.778423435 × 10^11
Monotonicity: Increasing
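
Several of the figures above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short Python check against the reported `case_id` values (variable names are illustrative):

```python
# Values copied from the case_id statistics above.
minimum, maximum = 2, 2_651_092
q1, q3 = 1_251_506, 1_641_585
mean = 1_397_916.185
std = 760_159.4198

value_range = maximum - minimum   # reported: 2651090
iqr = q3 - q1                     # reported: 390079
cv = std / mean                   # reported: 0.5437803984
variance = std ** 2               # reported: 5.778423435e11

print(value_range, iqr)
print(f"CV = {cv:.4f}, variance = {variance:.4e}")
```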
Histogram with fixed size bins (bins=50)
Value                    Count      Frequency (%)
1451156                  20         < 0.1%
149924                   20         < 0.1%
1617749                  20         < 0.1%
149841                   20         < 0.1%
149843                   20         < 0.1%
2538422                  20         < 0.1%
177936                   20         < 0.1%
2538424                  20         < 0.1%
111866                   20         < 0.1%
2588419                  20         < 0.1%
Other values (782987)    3887484    > 99.9%

Value      Count    Frequency (%)
2          2        < 0.1%
3          1        < 0.1%
4          1        < 0.1%
5          1        < 0.1%
6          3        < 0.1%

Value      Count    Frequency (%)
2651092    8        < 0.1%
2651091    3        < 0.1%
2651090    2        < 0.1%
2651089    12       < 0.1%
2651088    13       < 0.1%

actualdpd_943P
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 101
Distinct (%): < 0.1%
Missing: 2234
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.01056299785
Minimum: 0
Maximum: 3676
Zeros: 3882797
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 3676
Range: 3676
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.75427732
Coefficient of variation (CV): 355.417787
Kurtosis: 586099.5401
Mean: 0.01056299785
Median Absolute Deviation (MAD): 0
Skewness: 716.1410421
Sum: 41042
Variance: 14.0945982
Monotonicity: Not monotonic
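
The extreme skewness (γ1 ≈ 716) is what a zero-inflated column with a few very large values produces. The sketch below computes the moment-based sample skewness, g1 = m3 / m2^(3/2), on a small synthetic series shaped like this column. The data is made up for illustration, and the profiler's own figure may come from a bias-corrected estimator, so the numbers are not meant to match.

```python
def skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic series: almost all zeros, a few small values, one outlier.
zero_inflated = [0] * 9_990 + [1] * 9 + [3676]
symmetric = [-2, -1, 0, 1, 2]

print(f"zero-inflated: {skewness(zero_inflated):.1f}")
print(f"symmetric:     {skewness(symmetric):.1f}")
```

A single large value among many zeros is enough to push the skewness far above what any symmetric series would show.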
Histogram with fixed size bins (bins=50)
Value                Count      Frequency (%)
0                    3882797    99.9%
1                    870        < 0.1%
2                    530        < 0.1%
3                    338        < 0.1%
4                    174        < 0.1%
5                    118        < 0.1%
6                    85         < 0.1%
7                    64         < 0.1%
8                    58         < 0.1%
9                    46         < 0.1%
Other values (91)    370        < 0.1%
(Missing)            2234       0.1%

Value    Count      Frequency (%)
0        3882797    99.9%
1        870        < 0.1%
2        530        < 0.1%
3        338        < 0.1%
4        174        < 0.1%

Value    Count    Frequency (%)
3676     1        < 0.1%
3661     1        < 0.1%
2119     1        < 0.1%
2107     1        < 0.1%
1957     1        < 0.1%

annuity_853A
Real number (ℝ)

MISSING  ZEROS 

Distinct80309
Distinct (%)2.2%
Missing155851
Missing (%)4.0%
Infinite0
Infinite (%)0.0%
Mean3413.166466
Minimum0
Maximum105130.2
Zeros225443
Zeros (%)5.8%
Negative0
Negative (%)0.0%
Memory size29.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q11710.6
median2761.2
Q34396.4
95-th percentile8325.2
Maximum105130.2
Range105130.2
Interquartile range (IQR)2685.8

Descriptive statistics

Standard deviation: 2828.269
Coefficient of variation (CV): 0.8286349432
Kurtosis: 32.4088215
Mean: 3413.166466
Median Absolute Deviation (MAD): 1247.4
Skewness: 3.37853393
Sum: 1.273736725 × 10^10
Variance: 7999105.539
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 225443
 
5.8%
1580 3773
 
0.1%
1508 3156
 
0.1%
2716 2434
 
0.1%
2558.4001 1736
 
< 0.1%
2103 1685
 
< 0.1%
1668 1676
 
< 0.1%
2000 1666
 
< 0.1%
3837.4001 1625
 
< 0.1%
3840 1609
 
< 0.1%
Other values (80299) 3487030
89.7%
(Missing) 155851
 
4.0%
ValueCountFrequency (%)
0 225443
5.8%
1.8000001 1
 
< 0.1%
7.6 2
 
< 0.1%
8.2 2
 
< 0.1%
10.400001 1
 
< 0.1%
ValueCountFrequency (%)
105130.2 1
< 0.1%
103000 1
< 0.1%
100061.4 1
< 0.1%
99837.4 2
< 0.1%
99646.6 1
< 0.1%

approvaldate_319D
Text

MISSING 

Distinct5105
Distinct (%)0.2%
Missing1766021
Missing (%)45.4%
Memory size29.7 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters: 21216630
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row2019-01-11
2nd row2018-10-11
3rd row2018-12-31
4th row2018-11-02
5th row2018-12-11
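
The profiler treats this column as Text, although every value is a 10-character string in `YYYY-MM-DD` form. If the values are needed as dates, they can be parsed with the standard library. The sketch below is one way to do that, assuming missing entries arrive as `None` (the helper name is hypothetical).

```python
from datetime import date


def parse_iso_dates(values):
    """Convert 'YYYY-MM-DD' strings to date objects, keeping None as None."""
    return [date.fromisoformat(v) if v is not None else None for v in values]


# Sample rows copied from the report, plus one missing value.
sample = ["2019-01-11", "2018-10-11", "2018-12-31", None]
parsed = parse_iso_dates(sample)

print(parsed[0].year, parsed[2].month)
print(max(d for d in parsed if d is not None))
```

Once parsed, the values compare and sort chronologically, which plain strings only do by coincidence of the ISO format.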
ValueCountFrequency (%)
2018-12-07 2982
 
0.1%
2018-01-12 2867
 
0.1%
2018-01-13 2720
 
0.1%
2018-12-08 2567
 
0.1%
2019-01-13 2387
 
0.1%
2018-12-29 2273
 
0.1%
2018-07-27 2215
 
0.1%
2018-12-28 2193
 
0.1%
2018-11-24 2161
 
0.1%
2017-12-02 2126
 
0.1%
Other values (5095) 2097172
98.8%

Most occurring characters

ValueCountFrequency (%)
0 4872977
23.0%
- 4243326
20.0%
1 3932682
18.5%
2 3489665
16.4%
8 964186
 
4.5%
7 774209
 
3.6%
9 687236
 
3.2%
3 618144
 
2.9%
6 606097
 
2.9%
5 522667
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16973304
80.0%
Dash Punctuation 4243326
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4872977
28.7%
1 3932682
23.2%
2 3489665
20.6%
8 964186
 
5.7%
7 774209
 
4.6%
9 687236
 
4.0%
3 618144
 
3.6%
6 606097
 
3.6%
5 522667
 
3.1%
4 505441
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 4243326
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21216630
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4872977
23.0%
- 4243326
20.0%
1 3932682
18.5%
2 3489665
16.4%
8 964186
 
4.5%
7 774209
 
3.6%
9 687236
 
3.2%
3 618144
 
2.9%
6 606097
 
2.9%
5 522667
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21216630
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4872977
23.0%
- 4243326
20.0%
1 3932682
18.5%
2 3489665
16.4%
8 964186
 
4.5%
7 774209
 
3.6%
9 687236
 
3.2%
3 618144
 
2.9%
6 606097
 
2.9%
5 522667
 
2.5%

byoccupationinc_3656910L
Real number (ℝ)

MISSING  ZEROS 

Distinct24298
Distinct (%)2.5%
Missing2896024
Missing (%)74.5%
Infinite0
Infinite (%)0.0%
Mean19796.48403
Minimum0
Maximum200000
Zeros63137
Zeros (%)1.6%
Negative0
Negative (%)0.0%
Memory size29.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median5000
Q330000
95-th percentile71316.25
Maximum200000
Range200000
Interquartile range (IQR)29999

Descriptive statistics

Standard deviation: 30687.65251
Coefficient of variation (CV): 1.550156708
Kurtosis: 10.76351177
Mean: 19796.48403
Median Absolute Deviation (MAD): 5000
Skewness: 2.799320872
Sum: 1.963138135 × 10^10
Variance: 941732016.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 424404
 
10.9%
0 63137
 
1.6%
15000 50750
 
1.3%
20000 42942
 
1.1%
30000 38439
 
1.0%
25000 34911
 
0.9%
50000 33424
 
0.9%
10000 23527
 
0.6%
40000 19665
 
0.5%
35000 19252
 
0.5%
Other values (24288) 241209
 
6.2%
(Missing) 2896024
74.5%
ValueCountFrequency (%)
0 63137
 
1.6%
1 424404
10.9%
2 9
 
< 0.1%
3 1
 
< 0.1%
4 3
 
< 0.1%
ValueCountFrequency (%)
200000 7073
0.2%
199000 12
 
< 0.1%
198600 3
 
< 0.1%
198000 15
 
< 0.1%
197000 9
 
< 0.1%
Distinct71
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size29.7 MiB

Length

Max length12
Median length8
Mean length8.84928379
Min length8

Characters and Unicode

Total characters34403219
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)< 0.1%

Sample

1st rowa55475b1
2nd rowa55475b1
3rd rowP94_109_143
4th rowP24_27_36
5th rowP85_114_140
ValueCountFrequency (%)
a55475b1 2729942
70.2%
p94_109_143 848995
 
21.8%
p180_60_137 43051
 
1.1%
p73_130_169 42479
 
1.1%
p198_89_166 37300
 
1.0%
p85_114_140 34868
 
0.9%
p30_86_84 31802
 
0.8%
p24_27_36 16040
 
0.4%
p141_135_146 15512
 
0.4%
p52_67_90 13724
 
0.4%
Other values (61) 73971
 
1.9%

Most occurring characters

ValueCountFrequency (%)
5 8288778
24.1%
1 5002986
14.5%
4 4590459
13.3%
7 2870275
 
8.3%
a 2729942
 
7.9%
b 2729942
 
7.9%
_ 2315484
 
6.7%
9 1874613
 
5.4%
P 1157742
 
3.4%
0 1109335
 
3.2%
Other values (4) 1733663
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25470109
74.0%
Lowercase Letter 5459884
 
15.9%
Connector Punctuation 2315484
 
6.7%
Uppercase Letter 1157742
 
3.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 8288778
32.5%
1 5002986
19.6%
4 4590459
18.0%
7 2870275
 
11.3%
9 1874613
 
7.4%
0 1109335
 
4.4%
3 1084801
 
4.3%
6 302589
 
1.2%
8 252522
 
1.0%
2 93751
 
0.4%
Lowercase Letter
ValueCountFrequency (%)
a 2729942
50.0%
b 2729942
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2315484
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 1157742
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27785593
80.8%
Latin 6617626
 
19.2%

Most frequent character per script

Common
ValueCountFrequency (%)
5 8288778
29.8%
1 5002986
18.0%
4 4590459
16.5%
7 2870275
 
10.3%
_ 2315484
 
8.3%
9 1874613
 
6.7%
0 1109335
 
4.0%
3 1084801
 
3.9%
6 302589
 
1.1%
8 252522
 
0.9%
Latin
ValueCountFrequency (%)
a 2729942
41.3%
b 2729942
41.3%
P 1157742
17.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34403219
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 8288778
24.1%
1 5002986
14.5%
4 4590459
13.3%
7 2870275
 
8.3%
a 2729942
 
7.9%
b 2729942
 
7.9%
_ 2315484
 
6.7%
9 1874613
 
5.4%
P 1157742
 
3.4%
0 1109335
 
3.2%
Other values (4) 1733663
 
5.0%

childnum_21L
Real number (ℝ)

MISSING  ZEROS 

Distinct20
Distinct (%)< 0.1%
Missing1953893
Missing (%)50.3%
Infinite0
Infinite (%)0.0%
Mean0.8434701578
Minimum0
Maximum20
Zeros1054010
Zeros (%)27.1%
Negative0
Negative (%)0.0%
Memory size29.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile3
Maximum20
Range20
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.209425285
Coefficient of variation (CV)1.433868494
Kurtosis6.155115887
Mean0.8434701578
Median Absolute Deviation (MAD)0
Skewness1.989613502
Sum1631095
Variance1.46270952
MonotonicityNot monotonic
Histogram with fixed size bins (bins=20)
ValueCountFrequency (%)
0 1054010
27.1%
1 430613
 
11.1%
2 277280
 
7.1%
3 99088
 
2.5%
4 39691
 
1.0%
5 19363
 
0.5%
6 8144
 
0.2%
7 3028
 
0.1%
8 1401
 
< 0.1%
9 539
 
< 0.1%
Other values (10) 634
 
< 0.1%
(Missing) 1953893
50.3%
ValueCountFrequency (%)
0 1054010
27.1%
1 430613
11.1%
2 277280
 
7.1%
3 99088
 
2.5%
4 39691
 
1.0%
ValueCountFrequency (%)
20 12
< 0.1%
18 5
 
< 0.1%
17 4
 
< 0.1%
16 2
 
< 0.1%
15 18
< 0.1%
Distinct5107
Distinct (%)0.1%
Missing35
Missing (%)< 0.1%
Memory size29.7 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters38876490
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row2013-04-03
2nd row2013-04-03
3rd row2019-01-07
4th row2019-01-08
5th row2019-01-16
ValueCountFrequency (%)
2018-12-07 4961
 
0.1%
2018-01-12 4168
 
0.1%
2018-12-08 3986
 
0.1%
2018-12-28 3940
 
0.1%
2019-01-02 3859
 
0.1%
2018-01-13 3799
 
0.1%
2019-01-04 3796
 
0.1%
2018-07-27 3692
 
0.1%
2019-01-13 3646
 
0.1%
2019-01-11 3618
 
0.1%
Other values (5097) 3848184
99.0%

Most occurring characters

ValueCountFrequency (%)
0 8914984
22.9%
- 7775298
20.0%
1 7156075
18.4%
2 6398126
16.5%
8 1706927
 
4.4%
7 1362854
 
3.5%
9 1346334
 
3.5%
3 1148470
 
3.0%
6 1091746
 
2.8%
4 994262
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 31101192
80.0%
Dash Punctuation 7775298
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8914984
28.7%
1 7156075
23.0%
2 6398126
20.6%
8 1706927
 
5.5%
7 1362854
 
4.4%
9 1346334
 
4.3%
3 1148470
 
3.7%
6 1091746
 
3.5%
4 994262
 
3.2%
5 981414
 
3.2%
Dash Punctuation
ValueCountFrequency (%)
- 7775298
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 38876490
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8914984
22.9%
- 7775298
20.0%
1 7156075
18.4%
2 6398126
16.5%
8 1706927
 
4.4%
7 1362854
 
3.5%
9 1346334
 
3.5%
3 1148470
 
3.0%
6 1091746
 
2.8%
4 994262
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38876490
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8914984
22.9%
- 7775298
20.0%
1 7156075
18.4%
2 6398126
16.5%
8 1706927
 
4.4%
7 1362854
 
3.5%
9 1346334
 
3.5%
3 1148470
 
3.0%
6 1091746
 
2.8%
4 994262
 
2.6%

credacc_actualbalance_314A
Real number (ℝ)

MISSING 

Distinct57143
Distinct (%)34.0%
Missing3719506
Missing (%)95.7%
Infinite0
Infinite (%)0.0%
Mean20269.5803
Minimum-114086
Maximum2540730
Zeros37132
Zeros (%)1.0%
Negative365
Negative (%)< 0.1%
Memory size29.7 MiB

Quantile statistics

Minimum-114086
5-th percentile0
Q13.0305
median12585
Q330446
95-th percentile76163.788
Maximum2540730
Range2654816
Interquartile range (IQR)30442.9695

Descriptive statistics

Standard deviation26002.78165
Coefficient of variation (CV)1.282847561
Kurtosis528.4045432
Mean20269.5803
Median Absolute Deviation (MAD)12585
Skewness6.9851974
Sum3408897475
Variance676144653.7
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 37132
 
1.0%
100000 2944
 
0.1%
2 896
 
< 0.1%
12000 808
 
< 0.1%
42640 748
 
< 0.1%
0.2 735
 
< 0.1%
30000 642
 
< 0.1%
20300 539
 
< 0.1%
4 520
 
< 0.1%
40600 499
 
< 0.1%
Other values (57133) 122715
 
3.2%
(Missing) 3719506
95.7%
ValueCountFrequency (%)
-114086 1
< 0.1%
-57634.06 1
< 0.1%
-52432.37 1
< 0.1%
-47822.348 1
< 0.1%
-36996.402 1
< 0.1%
ValueCountFrequency (%)
2540730 1
< 0.1%
519966 1
< 0.1%
428026.25 1
< 0.1%
300000 1
< 0.1%
264004.8 2
< 0.1%

credacc_credlmt_575A
Real number (ℝ)

MISSING  ZEROS 

Distinct37849
Distinct (%)1.0%
Missing119070
Missing (%)3.1%
Infinite0
Infinite (%)0.0%
Mean3254.676597
Minimum0
Maximum400000
Zeros3470494
Zeros (%)89.3%
Negative0
Negative (%)0.0%
Memory size29.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile23294
Maximum400000
Range400000
Interquartile range (IQR)0

Descriptive statistics

Standard deviation: 14061.34948
Coefficient of variation (CV): 4.320352288
Kurtosis: 36.98869737
Mean: 3254.676597
Median Absolute Deviation (MAD): 0
Skewness: 5.584726264
Sum: 1.226561979 × 10^10
Variance: 197721549.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3470494
89.3%
100000 32568
 
0.8%
12000 13694
 
0.4%
20000 6333
 
0.2%
40000 6175
 
0.2%
60000 4714
 
0.1%
24000 3837
 
0.1%
30000 3538
 
0.1%
10000 3177
 
0.1%
42640 2828
 
0.1%
Other values (37839) 221256
 
5.7%
(Missing) 119070
 
3.1%
ValueCountFrequency (%)
0 3470494
89.3%
0.2 1275
 
< 0.1%
0.6 2
 
< 0.1%
3.6000001 2
 
< 0.1%
20 2
 
< 0.1%
ValueCountFrequency (%)
400000 14
< 0.1%
300000 14
< 0.1%
270400 1
 
< 0.1%
245200 1
 
< 0.1%
240000 1
 
< 0.1%

credacc_maxhisbal_375A
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct43046
Distinct (%)25.6%
Missing3719506
Missing (%)95.7%
Infinite0
Infinite (%)0.0%
Mean-3288.887246
Minimum-196108.17
Maximum7988198.5
Zeros98392
Zeros (%)2.5%
Negative32587
Negative (%)0.8%
Memory size29.7 MiB

Quantile statistics

Minimum-196108.17
5-th percentile-29255.55
Q10
median0
Q30
95-th percentile588.9267
Maximum7988198.5
Range8184306.67
Interquartile range (IQR)0

Descriptive statistics

Standard deviation28086.11335
Coefficient of variation (CV)-8.539700891
Kurtosis40843.88705
Mean-3288.887246
Median Absolute Deviation (MAD)0
Skewness154.6093224
Sum-553118479.3
Variance788829762.9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 98392
 
2.5%
2 2042
 
0.1%
4 1072
 
< 0.1%
6 560
 
< 0.1%
22 500
 
< 0.1%
10 343
 
< 0.1%
24 298
 
< 0.1%
8 285
 
< 0.1%
20 238
 
< 0.1%
42 213
 
< 0.1%
Other values (43036) 64235
 
1.7%
(Missing) 3719506
95.7%
ValueCountFrequency (%)
-196108.17 2
< 0.1%
-192894.4 1
< 0.1%
-185260 1
< 0.1%
-183642.02 1
< 0.1%
-181545.2 1
< 0.1%
ValueCountFrequency (%)
7988198.5 1
< 0.1%
3556000 1
< 0.1%
1900000 2
< 0.1%
1600000 1
< 0.1%
940000 1
< 0.1%

credacc_minhisbal_90A
Real number (ℝ)

MISSING  ZEROS 

Distinct42878
Distinct (%)25.5%
Missing3719506
Missing (%)95.7%
Infinite0
Infinite (%)0.0%
Mean-6554.784203
Minimum-206808.17
Maximum199567
Zeros100911
Zeros (%)2.6%
Negative43646
Negative (%)1.1%
Memory size29.7 MiB

Quantile statistics

Minimum-206808.17
5-th percentile-40000
Q1-1042.079025
median0
Q30
95-th percentile55.8066
Maximum199567
Range406375.17
Interquartile range (IQR)1042.079025

Descriptive statistics

Standard deviation16888.86244
Coefficient of variation (CV)-2.576570321
Kurtosis17.9638962
Mean-6554.784203
Median Absolute Deviation (MAD)0
Skewness-3.715506656
Sum-1102370498
Variance285233674.5
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 100911
 
2.6%
2 1383
 
< 0.1%
4 778
 
< 0.1%
6 427
 
< 0.1%
10 315
 
< 0.1%
22 309
 
< 0.1%
-10 242
 
< 0.1%
24 216
 
< 0.1%
8 210
 
< 0.1%
20 208
 
< 0.1%
Other values (42868) 63179
 
1.6%
(Missing) 3719506
95.7%
ValueCountFrequency (%)
-206808.17 2
< 0.1%
-200000 1
< 0.1%
-199998 1
< 0.1%
-199996 1
< 0.1%
-199994.4 1
< 0.1%
ValueCountFrequency (%)
199567 2
< 0.1%
100000 1
< 0.1%
89859.59 1
< 0.1%
79000 1
< 0.1%
70000 1
< 0.1%

credacc_status_367L
Text

MISSING 

Distinct6
Distinct (%)< 0.1%
Missing3719506
Missing (%)95.7%
Memory size29.7 MiB

Length

Max length3
Median length2
Mean length2.010262936
Min length2

Characters and Unicode

Total characters338082
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCL
2nd rowCL
3rd rowCL
4th rowCL
5th rowAC
ValueCountFrequency (%)
ac 90216
53.6%
cl 61855
36.8%
ca 14052
 
8.4%
pcl 1726
 
1.0%
po 282
 
0.2%
cr 47
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
C 167896
49.7%
A 104268
30.8%
L 63581
 
18.8%
P 2008
 
0.6%
O 282
 
0.1%
R 47
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 338082
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 167896
49.7%
A 104268
30.8%
L 63581
 
18.8%
P 2008
 
0.6%
O 282
 
0.1%
R 47
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 338082
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 167896
49.7%
A 104268
30.8%
L 63581
 
18.8%
P 2008
 
0.6%
O 282
 
0.1%
R 47
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 338082
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 167896
49.7%
A 104268
30.8%
L 63581
 
18.8%
P 2008
 
0.6%
O 282
 
0.1%
R 47
 
< 0.1%

credacc_transactions_402L
Real number (ℝ)

MISSING  ZEROS 

Distinct86
Distinct (%)0.1%
Missing3719506
Missing (%)95.7%
Infinite0
Infinite (%)0.0%
Mean0.5221253672
Minimum0
Maximum155
Zeros150233
Zeros (%)3.9%
Negative0
Negative (%)0.0%
Memory size29.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile2
Maximum155
Range155
Interquartile range (IQR)0

Descriptive statistics

Standard deviation2.951672329
Coefficient of variation (CV)5.653186982
Kurtosis344.9259013
Mean0.5221253672
Median Absolute Deviation (MAD)0
Skewness14.1469766
Sum87810
Variance8.712369536
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 150233
 
3.9%
1 6693
 
0.2%
2 2853
 
0.1%
3 1893
 
< 0.1%
4 1252
 
< 0.1%
5 931
 
< 0.1%
6 730
 
< 0.1%
7 552
 
< 0.1%
8 411
 
< 0.1%
9 387
 
< 0.1%
Other values (76) 2243
 
0.1%
(Missing) 3719506
95.7%
ValueCountFrequency (%)
0 150233
3.9%
1 6693
 
0.2%
2 2853
 
0.1%
3 1893
 
< 0.1%
4 1252
 
< 0.1%
ValueCountFrequency (%)
155 2
< 0.1%
135 1
< 0.1%
126 1
< 0.1%
123 1
< 0.1%
110 1
< 0.1%

credamount_590A
Real number (ℝ)

MISSING  ZEROS 

Distinct201793
Distinct (%)5.4%
Missing123329
Missing (%)3.2%
Infinite0
Infinite (%)0.0%
Mean38657.85509
Minimum0
Maximum715392
Zeros42592
Zeros (%)1.1%
Negative0
Negative (%)0.0%
Memory size29.7 MiB

Quantile statistics

Minimum0
5-th percentile5022
Q113998
median27000
Q350000
95-th percentile100000
Maximum715392
Range715392
Interquartile range (IQR)36002

Descriptive statistics

Standard deviation: 37544.33619
Coefficient of variation (CV): 0.9711955334
Kurtosis: 10.0483708
Mean: 38657.85509
Median Absolute Deviation (MAD): 15020
Skewness: 2.478930874
Sum: 1.455218901 × 10^11
Variance: 1409577180
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
60000 186103
 
4.8%
100000 175966
 
4.5%
40000 174743
 
4.5%
20000 170088
 
4.4%
30000 140432
 
3.6%
50000 63461
 
1.6%
10000 52190
 
1.3%
24000 47919
 
1.2%
80000 43464
 
1.1%
0 42592
 
1.1%
Other values (201783) 2667397
68.6%
(Missing) 123329
 
3.2%
ValueCountFrequency (%)
0 42592
1.1%
0.2 675
 
< 0.1%
0.6 1
 
< 0.1%
3.6000001 1
 
< 0.1%
20 1
 
< 0.1%
ValueCountFrequency (%)
715392 2
< 0.1%
550000 1
< 0.1%
501422.22 2
< 0.1%
493000 1
< 0.1%
480665.1 2
< 0.1%

credtype_587L
Text

MISSING 

Distinct3
Distinct (%)< 0.1%
Missing123329
Missing (%)3.2%
Memory size29.7 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters11293065
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCAL
2nd rowCAL
3rd rowCAL
4th rowCAL
5th rowCAL
ValueCountFrequency (%)
col 2035876
54.1%
cal 1478343
39.3%
rel 250136
 
6.6%

Most occurring characters

ValueCountFrequency (%)
L 3764355
33.3%
C 3514219
31.1%
O 2035876
18.0%
A 1478343
 
13.1%
R 250136
 
2.2%
E 250136
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11293065
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 3764355
33.3%
C 3514219
31.1%
O 2035876
18.0%
A 1478343
 
13.1%
R 250136
 
2.2%
E 250136
 
2.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 11293065
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 3764355
33.3%
C 3514219
31.1%
O 2035876
18.0%
A 1478343
 
13.1%
R 250136
 
2.2%
E 250136
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11293065
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L 3764355
33.3%
C 3514219
31.1%
O 2035876
18.0%
A 1478343
 
13.1%
R 250136
 
2.2%
E 250136
 
2.2%

currdebt_94A
Real number (ℝ)

MISSING  ZEROS 

Distinct346082
Distinct (%)13.2%
Missing1270377
Missing (%)32.7%
Infinite0
Infinite (%)0.0%
Mean5229.100749
Minimum0
Maximum507429.72
Zeros2238778
Zeros (%)57.6%
Negative0
Negative (%)0.0%
Memory size29.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile35505.071
Maximum507429.72
Range507429.72
Interquartile range (IQR)0

Descriptive statistics

Standard deviation: 19278.42258
Coefficient of variation (CV): 3.686756769
Kurtosis: 47.65318341
Mean: 5229.100749
Median Absolute Deviation (MAD): 0
Skewness: 5.85585581
Sum: 1.368616199 × 10^10
Variance: 371657577.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2238778
57.6%
19998 91
 
< 0.1%
100000 91
 
< 0.1%
20000 85
 
< 0.1%
11998 84
 
< 0.1%
17998 82
 
< 0.1%
7998 80
 
< 0.1%
40000 77
 
< 0.1%
15998 75
 
< 0.1%
13998 75
 
< 0.1%
Other values (346072) 377789
 
9.7%
(Missing) 1270377
32.7%
ValueCountFrequency (%)
0 2238778
57.6%
0.002 1
 
< 0.1%
0.010000001 1
 
< 0.1%
0.020000001 1
 
< 0.1%
0.048 2
 
< 0.1%
ValueCountFrequency (%)
507429.72 1
< 0.1%
507040.06 1
< 0.1%
491492.1 1
< 0.1%
490718.6 1
< 0.1%
489944.25 1
< 0.1%

dateactivated_425D
Text

MISSING 

Distinct3939
Distinct (%)0.2%
Missing1844702
Missing (%)47.4%
Memory size29.7 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters20429820
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique53 ?
Unique (%)< 0.1%

Sample

1st row2018-10-19
2nd row2018-11-07
3rd row2018-12-28
4th row2018-10-09
5th row2018-11-16
ValueCountFrequency (%)
2019-01-31 3811
 
0.2%
2018-10-17 3053
 
0.1%
2018-10-04 3022
 
0.1%
2019-01-11 2915
 
0.1%
2018-05-29 2894
 
0.1%
2019-01-09 2885
 
0.1%
2019-01-10 2842
 
0.1%
2018-11-14 2817
 
0.1%
2019-01-30 2805
 
0.1%
2018-09-12 2780
 
0.1%
Other values (3929) 2013158
98.5%

Most occurring characters

ValueCountFrequency (%)
0 4702739
23.0%
- 4085964
20.0%
1 3822098
18.7%
2 3324977
16.3%
8 938945
 
4.6%
7 742863
 
3.6%
9 664418
 
3.3%
6 587810
 
2.9%
3 569050
 
2.8%
5 503414
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16343856
80.0%
Dash Punctuation 4085964
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4702739
28.8%
1 3822098
23.4%
2 3324977
20.3%
8 938945
 
5.7%
7 742863
 
4.5%
9 664418
 
4.1%
6 587810
 
3.6%
3 569050
 
3.5%
5 503414
 
3.1%
4 487542
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 4085964
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 20429820
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4702739
23.0%
- 4085964
20.0%
1 3822098
18.7%
2 3324977
16.3%
8 938945
 
4.6%
7 742863
 
3.6%
9 664418
 
3.3%
6 587810
 
2.9%
3 569050
 
2.8%
5 503414
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20429820
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4702739
23.0%
- 4085964
20.0%
1 3822098
18.7%
2 3324977
16.3%
8 938945
 
4.6%
7 742863
 
3.6%
9 664418
 
3.3%
6 587810
 
2.9%
3 569050
 
2.8%
5 503414
 
2.5%
Distinct479
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size29.7 MiB

Length

Max length12
Median length11
Mean length10.43327107
Min length8

Characters and Unicode

Total characters40561261
Distinct characters18
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 54
Unique (%): < 0.1%

Sample

1st row: P136_108_173
2nd row: P136_108_173
3rd row: P131_33_167
4th row: P194_82_174
5th row: P54_133_26
ValueCountFrequency (%)
a55475b1 388961
 
10.0%
p131_33_167 190215
 
4.9%
p123_6_84 160709
 
4.1%
p197_47_166 137010
 
3.5%
p204_99_158 114381
 
2.9%
p98_137_111 94298
 
2.4%
p62_144_102 86540
 
2.2%
p159_143_123 85401
 
2.2%
p147_21_170 80957
 
2.1%
p178_112_160 71042
 
1.8%
Other values (469) 2478170
63.7%

Most occurring characters

ValueCountFrequency (%)
1 8872392
21.9%
_ 6997446
17.3%
P 3498722
 
8.6%
7 3069372
 
7.6%
5 3022413
 
7.5%
4 2644775
 
6.5%
6 2218257
 
5.5%
3 2204920
 
5.4%
2 2070115
 
5.1%
9 1904803
 
4.7%
Other values (8) 4058046
10.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 29287166
72.2%
Connector Punctuation 6997446
 
17.3%
Uppercase Letter 3498723
 
8.6%
Lowercase Letter 777926
 
1.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 8872392
30.3%
7 3069372
 
10.5%
5 3022413
 
10.3%
4 2644775
 
9.0%
6 2218257
 
7.6%
3 2204920
 
7.5%
2 2070115
 
7.1%
9 1904803
 
6.5%
8 1893444
 
6.5%
0 1386675
 
4.7%
Lowercase Letter
ValueCountFrequency (%)
a 388961
50.0%
b 388961
50.0%
e 2
 
< 0.1%
m 1
 
< 0.1%
t 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
P 3498722
> 99.9%
Q 1
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 6997446
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 36284612
89.5%
Latin 4276649
 
10.5%

Most frequent character per script

Common
ValueCountFrequency (%)
1 8872392
24.5%
_ 6997446
19.3%
7 3069372
 
8.5%
5 3022413
 
8.3%
4 2644775
 
7.3%
6 2218257
 
6.1%
3 2204920
 
6.1%
2 2070115
 
5.7%
9 1904803
 
5.2%
8 1893444
 
5.2%
Latin
ValueCountFrequency (%)
P 3498722
81.8%
a 388961
 
9.1%
b 388961
 
9.1%
e 2
 
< 0.1%
Q 1
 
< 0.1%
m 1
 
< 0.1%
t 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40561261
100.0%


downpmt_134A
Real number (ℝ)

MISSING  ZEROS 

Distinct: 18720
Distinct (%): 0.5%
Missing: 123329
Missing (%): 3.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 457.8898882
Minimum: 0
Maximum: 320400
Zeros: 3381858
Zeros (%): 87.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2000
Maximum: 320400
Range: 320400
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2697.225764
Coefficient of variation (CV): 5.890555423
Kurtosis: 776.6842527
Mean: 457.8898882
Median Absolute Deviation (MAD): 0
Skewness: 17.93615961
Sum: 1723660090
Variance: 7275026.82
Monotonicity: Not monotonic
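
To make the relationship between these figures concrete, the following sketch computes a few of the same statistics with Python's standard `statistics` module. The function name `summarize`, the toy data, the use of the sample standard deviation (n - 1) and the "inclusive" quartile method are all assumptions; the report does not state which estimators it uses.

```python
import statistics


def summarize(values):
    """Compute a handful of descriptive statistics, ignoring missing values."""
    clean = sorted(v for v in values if v is not None)
    mean = statistics.fmean(clean)
    std = statistics.stdev(clean)  # sample standard deviation (assumption)
    median = statistics.median(clean)
    # MAD: median of the absolute deviations from the median.
    mad = statistics.median([abs(v - median) for v in clean])
    # Quartiles with linear interpolation over the observed range (assumption).
    q1, _, q3 = statistics.quantiles(clean, n=4, method="inclusive")
    return {
        "mean": mean,
        "std": std,
        # CV is the standard deviation divided by the mean.
        "cv": std / mean if mean else float("nan"),
        "mad": mad,
        "iqr": q3 - q1,
    }


# Toy zero-inflated column: mostly zeros, one large value, one missing entry.
toy = [0, 0, 0, 0, 2000, None]
print(summarize(toy))
```

With mostly zero values the median, MAD and IQR all collapse to 0 while the mean and standard deviation stay positive, which is the same pattern this variable shows. For the reported figures, 2697.225764 / 457.8898882 ≈ 5.8906, which agrees with the coefficient of variation listed above.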
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3381858
87.0%
2000 44574
 
1.1%
4000 29703
 
0.8%
1000 28357
 
0.7%
6000 16262
 
0.4%
200 13639
 
0.4%
10000 13472
 
0.3%
400 12662
 
0.3%
3000 11986
 
0.3%
8000 8796
 
0.2%
Other values (18710) 203046
 
5.2%
(Missing) 123329
 
3.2%
ValueCountFrequency (%)
0 3381858
87.0%
0.2 390
 
< 0.1%
0.4 48
 
< 0.1%
0.6 124
 
< 0.1%
0.8 33
 
< 0.1%
ValueCountFrequency (%)
320400 1
 
< 0.1%
305200 1
 
< 0.1%
300000 2
< 0.1%
274998 3
< 0.1%
268134.4 1
 
< 0.1%

dtlastpmt_581D
Text

MISSING 

Distinct: 2150
Distinct (%): 0.2%
Missing: 2860375
Missing (%): 73.6%
Memory size: 29.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10273090
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 214
Unique (%): < 0.1%

Sample

1st row: 2019-01-10
2nd row: 2019-01-03
3rd row: 2019-01-08
4th row: 2018-12-05
5th row: 2018-12-26
ValueCountFrequency (%)
2019-09-16 22398
 
2.2%
2018-12-24 2210
 
0.2%
2019-01-11 2176
 
0.2%
2018-07-23 2052
 
0.2%
2018-12-20 2012
 
0.2%
2019-01-02 1982
 
0.2%
2019-02-25 1939
 
0.2%
2018-05-24 1929
 
0.2%
2019-01-22 1922
 
0.2%
2018-12-25 1916
 
0.2%
Other values (2140) 986773
96.1%

Most occurring characters

ValueCountFrequency (%)
0 2267061
22.1%
- 2054618
20.0%
1 1871415
18.2%
2 1671458
16.3%
8 544478
 
5.3%
9 486902
 
4.7%
7 418414
 
4.1%
6 361158
 
3.5%
3 229774
 
2.2%
5 196197
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8218472
80.0%
Dash Punctuation 2054618
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2267061
27.6%
1 1871415
22.8%
2 1671458
20.3%
8 544478
 
6.6%
9 486902
 
5.9%
7 418414
 
5.1%
6 361158
 
4.4%
3 229774
 
2.8%
5 196197
 
2.4%
4 171615
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 2054618
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10273090
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2267061
22.1%
- 2054618
20.0%
1 1871415
18.2%
2 1671458
16.3%
8 544478
 
5.3%
9 486902
 
4.7%
7 418414
 
4.1%
6 361158
 
3.5%
3 229774
 
2.2%
5 196197
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10273090
100.0%


dtlastpmtallstes_3545839D
Text

MISSING 

Distinct: 2168
Distinct (%): 0.1%
Missing: 2434155
Missing (%): 62.6%
Memory size: 29.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 14535290
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 224
Unique (%): < 0.1%

Sample

1st row: 2019-01-10
2nd row: 2018-12-29
3rd row: 2019-01-03
4th row: 2019-01-08
5th row: 2019-01-05
ValueCountFrequency (%)
2019-09-16 24576
 
1.7%
2019-05-22 4054
 
0.3%
2019-05-20 3990
 
0.3%
2019-03-25 3786
 
0.3%
2019-01-22 3759
 
0.3%
2019-05-27 3685
 
0.3%
2019-07-19 3601
 
0.2%
2019-02-25 3591
 
0.2%
2019-03-19 3582
 
0.2%
2019-03-20 3565
 
0.2%
Other values (2158) 1395340
96.0%

Most occurring characters

ValueCountFrequency (%)
0 3193320
22.0%
- 2907058
20.0%
1 2662034
18.3%
2 2367467
16.3%
9 969878
 
6.7%
8 655291
 
4.5%
7 507252
 
3.5%
6 443031
 
3.0%
3 318089
 
2.2%
5 272558
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11628232
80.0%
Dash Punctuation 2907058
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3193320
27.5%
1 2662034
22.9%
2 2367467
20.4%
9 969878
 
8.3%
8 655291
 
5.6%
7 507252
 
4.4%
6 443031
 
3.8%
3 318089
 
2.7%
5 272558
 
2.3%
4 239312
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 2907058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14535290
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3193320
22.0%
- 2907058
20.0%
1 2662034
18.3%
2 2367467
16.3%
9 969878
 
6.7%
8 655291
 
4.5%
7 507252
 
3.5%
6 443031
 
3.0%
3 318089
 
2.2%
5 272558
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14535290
100.0%

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.7 MiB

Length

Max length: 11
Median length: 10
Mean length: 9.311229771
Min length: 8

Characters and Unicode

Total characters: 36199119
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: P97_36_170
2nd row: P97_36_170
3rd row: P97_36_170
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 1693666
43.6%
p97_36_170 1458441
37.5%
p33_146_175 680819
17.5%
p106_81_188 26860
 
0.7%
p17_36_170 25966
 
0.7%
p157_18_172 1932
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 5763749
15.9%
7 5347163
14.8%
1 4652053
12.9%
_ 4388036
12.1%
3 2846045
7.9%
4 2374485
6.6%
P 2194018
 
6.1%
6 2192086
 
6.1%
a 1693666
 
4.7%
b 1693666
 
4.7%
Other values (4) 3054152
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26229733
72.5%
Connector Punctuation 4388036
 
12.1%
Lowercase Letter 3387332
 
9.4%
Uppercase Letter 2194018
 
6.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 5763749
22.0%
7 5347163
20.4%
1 4652053
17.7%
3 2846045
10.9%
4 2374485
9.1%
6 2192086
 
8.4%
0 1511267
 
5.8%
9 1458441
 
5.6%
8 82512
 
0.3%
2 1932
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 1693666
50.0%
b 1693666
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4388036
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 2194018
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30617769
84.6%
Latin 5581350
 
15.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 5763749
18.8%
7 5347163
17.5%
1 4652053
15.2%
_ 4388036
14.3%
3 2846045
9.3%
4 2374485
7.8%
6 2192086
 
7.2%
0 1511267
 
4.9%
9 1458441
 
4.8%
8 82512
 
0.3%
Latin
ValueCountFrequency (%)
P 2194018
39.3%
a 1693666
30.3%
b 1693666
30.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36199119
100.0%


employedfrom_700D
Text

MISSING 

Distinct: 9285
Distinct (%): 0.5%
Missing: 2180869
Missing (%): 56.1%
Memory size: 29.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 17068150
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1902
Unique (%): 0.1%

Sample

1st row: 2010-02-15
2nd row: 2010-02-15
3rd row: 2018-05-15
4th row: 2013-09-15
5th row: 2012-09-15
ValueCountFrequency (%)
2015-01-15 29305
 
1.7%
2016-01-15 28713
 
1.7%
2014-01-15 28642
 
1.7%
2017-01-15 28477
 
1.7%
2013-01-15 28312
 
1.7%
2012-01-15 24185
 
1.4%
2010-01-15 18632
 
1.1%
2011-09-15 17503
 
1.0%
2010-09-15 17495
 
1.0%
2012-09-15 17356
 
1.0%
Other values (9275) 1468195
86.0%

Most occurring characters

ValueCountFrequency (%)
0 3897011
22.8%
1 3611856
21.2%
- 3413630
20.0%
2 1980481
11.6%
5 1922389
11.3%
9 677460
 
4.0%
8 341301
 
2.0%
6 318819
 
1.9%
3 311076
 
1.8%
4 305810
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 13654520
80.0%
Dash Punctuation 3413630
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3897011
28.5%
1 3611856
26.5%
2 1980481
14.5%
5 1922389
14.1%
9 677460
 
5.0%
8 341301
 
2.5%
6 318819
 
2.3%
3 311076
 
2.3%
4 305810
 
2.2%
7 288317
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 3413630
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17068150
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3897011
22.8%
1 3611856
21.2%
- 3413630
20.0%
2 1980481
11.6%
5 1922389
11.3%
9 677460
 
4.0%
8 341301
 
2.0%
6 318819
 
1.9%
3 311076
 
1.8%
4 305810
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17068150
100.0%


familystate_726L
Text

MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 1245201
Missing (%): 32.0%
Memory size: 29.7 MiB

Length

Max length: 19
Median length: 7
Mean length: 7.068848882
Min length: 6

Characters and Unicode

Total characters: 18679313
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SINGLE
2nd row: SINGLE
3rd row: MARRIED
4th row: SINGLE
5th row: SINGLE
ValueCountFrequency (%)
married 1904175
72.1%
single 391421
 
14.8%
widowed 220519
 
8.3%
divorced 85733
 
3.2%
living_with_partner 40635
 
1.5%

Most occurring characters

ValueCountFrequency (%)
R 3975353
21.3%
I 2723753
14.6%
E 2642483
14.1%
D 2516679
13.5%
A 1944810
10.4%
M 1904175
10.2%
W 481673
 
2.6%
N 472691
 
2.5%
L 432056
 
2.3%
G 432056
 
2.3%
Other values (8) 1153584
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 18598043
99.6%
Connector Punctuation 81270
 
0.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 3975353
21.4%
I 2723753
14.6%
E 2642483
14.2%
D 2516679
13.5%
A 1944810
10.5%
M 1904175
10.2%
W 481673
 
2.6%
N 472691
 
2.5%
L 432056
 
2.3%
G 432056
 
2.3%
Other values (7) 1072314
 
5.8%
Connector Punctuation
ValueCountFrequency (%)
_ 81270
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18598043
99.6%
Common 81270
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 3975353
21.4%
I 2723753
14.6%
E 2642483
14.2%
D 2516679
13.5%
A 1944810
10.5%
M 1904175
10.2%
W 481673
 
2.6%
N 472691
 
2.5%
L 432056
 
2.3%
G 432056
 
2.3%
Other values (7) 1072314
 
5.8%
Common
ValueCountFrequency (%)
_ 81270
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18679313
100.0%


firstnonzeroinstldate_307D
Text

MISSING 

Distinct: 4886
Distinct (%): 0.1%
Missing: 365175
Missing (%): 9.4%
Memory size: 29.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 35225090
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): < 0.1%

Sample

1st row: 2013-05-04
2nd row: 2013-05-04
3rd row: 2019-02-07
4th row: 2019-02-08
5th row: 2018-10-12
ValueCountFrequency (%)
2019-03-14 8057
 
0.2%
2018-12-15 6898
 
0.2%
2018-09-15 6596
 
0.2%
2018-02-15 6117
 
0.2%
2019-02-11 6078
 
0.2%
2018-03-14 5897
 
0.2%
2018-02-11 5858
 
0.2%
2018-07-15 5657
 
0.2%
2019-02-15 5602
 
0.2%
2017-12-15 5221
 
0.1%
Other values (4876) 3460528
98.2%

Most occurring characters

ValueCountFrequency (%)
0 8125548
23.1%
- 7045018
20.0%
1 6501171
18.5%
2 5801762
16.5%
8 1472398
 
4.2%
9 1257862
 
3.6%
7 1174381
 
3.3%
5 1090650
 
3.1%
3 972274
 
2.8%
6 949158
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 28180072
80.0%
Dash Punctuation 7045018
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8125548
28.8%
1 6501171
23.1%
2 5801762
20.6%
8 1472398
 
5.2%
9 1257862
 
4.5%
7 1174381
 
4.2%
5 1090650
 
3.9%
3 972274
 
3.5%
6 949158
 
3.4%
4 834868
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 7045018
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 35225090
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8125548
23.1%
- 7045018
20.0%
1 6501171
18.5%
2 5801762
16.5%
8 1472398
 
4.2%
9 1257862
 
3.6%
7 1174381
 
3.3%
5 1090650
 
3.1%
3 972274
 
2.8%
6 949158
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35225090
100.0%


inittransactioncode_279L
Text

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 123329
Missing (%): 3.2%
Memory size: 29.7 MiB

Length

Max length: 4
Median length: 3
Mean length: 3.392895197
Min length: 3

Characters and Unicode

Total characters: 12772062
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CASH
2nd row: CASH
3rd row: CASH
4th row: CASH
5th row: CASH
ValueCountFrequency (%)
pos 2167053
57.6%
cash 1478997
39.3%
ndf 118305
 
3.1%

Most occurring characters

ValueCountFrequency (%)
S 3646050
28.5%
P 2167053
17.0%
O 2167053
17.0%
C 1478997
11.6%
A 1478997
11.6%
H 1478997
11.6%
N 118305
 
0.9%
D 118305
 
0.9%
F 118305
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12772062
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 3646050
28.5%
P 2167053
17.0%
O 2167053
17.0%
C 1478997
11.6%
A 1478997
11.6%
H 1478997
11.6%
N 118305
 
0.9%
D 118305
 
0.9%
F 118305
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 12772062
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 3646050
28.5%
P 2167053
17.0%
O 2167053
17.0%
C 1478997
11.6%
A 1478997
11.6%
H 1478997
11.6%
N 118305
 
0.9%
D 118305
 
0.9%
F 118305
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12772062
100.0%


isbidproduct_390L
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 35
Missing (%): < 0.1%
Memory size: 29.7 MiB

Value      Count    Frequency (%)
False      3668816  94.4%
True       218833   5.6%
(Missing)  35       < 0.1%
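
The IMBALANCE flag on this variable can be related to the class counts with a short calculation. The sketch below uses one common definition of an imbalance score, 1 minus the Shannon entropy normalized by the maximum possible entropy; that this is the definition behind the report's alert is an assumption, and the function name `imbalance_score` is hypothetical. Missing values are left out of the class counts.

```python
import math


def imbalance_score(class_counts):
    """Return 1 - normalized Shannon entropy of the class distribution.

    0.0 means perfectly balanced classes; values near 1.0 mean that one
    class dominates.
    """
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1.0 - entropy / math.log2(k)


# Counts of False and True for isbidproduct_390L, taken from the table above.
score = imbalance_score([3668816, 218833])
print(f"{score:.1%}")
```

Under these assumptions the score for the False/True counts comes out at about 68.7%, the same figure quoted for isbidproduct_390L in the Alerts section.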

isdebitcard_527L
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 3637550
Missing (%): 93.6%
Memory size: 29.7 MiB

Value      Count    Frequency (%)
False      212706   5.5%
True       37428    1.0%
(Missing)  3637550  93.6%

mainoccupationinc_437A
Real number (ℝ)

Distinct: 21957
Distinct (%): 0.6%
Missing: 36612
Missing (%): 0.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 40029.69379
Minimum: 0
Maximum: 196000
Zeros: 47
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 6000
Q1: 18000
median: 34000
Q3: 52000
95-th percentile: 100000
Maximum: 196000
Range: 196000
Interquartile range (IQR): 34000

Descriptive statistics

Standard deviation: 31396.36665
Coefficient of variation (CV): 0.7843269254
Kurtosis: 5.201732139
Mean: 40029.69379
Median Absolute Deviation (MAD): 17000
Skewness: 1.863363842
Sum: 1.541572329 × 10^11
Variance: 985731838.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
30000 278622
 
7.2%
40000 272284
 
7.0%
50000 230602
 
5.9%
60000 184523
 
4.7%
20000 145935
 
3.8%
24000 124888
 
3.2%
70000 114524
 
2.9%
36000 113091
 
2.9%
16000 74780
 
1.9%
12000 69980
 
1.8%
Other values (21947) 2241843
57.7%
ValueCountFrequency (%)
0 47
< 0.1%
0.038 1
 
< 0.1%
0.2 78
< 0.1%
0.4 6
 
< 0.1%
0.6 6
 
< 0.1%
ValueCountFrequency (%)
196000 19426
0.5%
195800 4
 
< 0.1%
195600 16
 
< 0.1%
195540 1
 
< 0.1%
195400 1
 
< 0.1%

maxdpdtolerance_577P
Real number (ℝ)

MISSING  ZEROS 

Distinct: 3199
Distinct (%): 0.2%
Missing: 1817378
Missing (%): 46.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.26581674
Minimum: 0
Maximum: 4058
Zeros: 1527003
Zeros (%): 39.3%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 15
Maximum: 4058
Range: 4058
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 134.8893241
Coefficient of variation (CV): 10.16818841
Kurtosis: 328.0205761
Mean: 13.26581674
Median Absolute Deviation (MAD): 0
Skewness: 16.64564203
Sum: 27464300
Variance: 18195.12975
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1527003
39.3%
1 283391
 
7.3%
5 35934
 
0.9%
6 28382
 
0.7%
10 21438
 
0.6%
4 11330
 
0.3%
9 11187
 
0.3%
14 10079
 
0.3%
7 9775
 
0.3%
18 9442
 
0.2%
Other values (3189) 122345
 
3.1%
(Missing) 1817378
46.7%
ValueCountFrequency (%)
0 1527003
39.3%
1 283391
 
7.3%
2 8816
 
0.2%
3 2658
 
0.1%
4 11330
 
0.3%
ValueCountFrequency (%)
4058 1
< 0.1%
4025 1
< 0.1%
4024 1
< 0.1%
4000 2
< 0.1%
3999 1
< 0.1%

num_group1
Real number (ℝ)

ZEROS 

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.910262254
Minimum: 0
Maximum: 19
Zeros: 782997
Zeros (%): 20.1%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 6
95-th percentile: 13
Maximum: 19
Range: 19
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 4.101353263
Coefficient of variation (CV): 1.048869103
Kurtosis: 1.58474303
Mean: 3.910262254
Median Absolute Deviation (MAD): 2
Skewness: 1.393787504
Sum: 15201864
Variance: 16.82109859
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
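
For readers unfamiliar with fixed-size binning, here is a minimal sketch of how values can be assigned to equal-width bins spanning the observed minimum and maximum. The function name `fixed_width_histogram` and the toy input are assumptions for illustration, not the report's plotting code.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    clean = [v for v in values if v is not None]
    counts = [0] * bins
    if not clean:
        return counts
    lo, hi = min(clean), max(clean)
    if hi == lo:
        # Degenerate range: every value lands in the first bin.
        counts[0] = len(clean)
        return counts
    width = (hi - lo) / bins
    for v in clean:
        # The maximum sits on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Toy integer data on the same 0..19 range as num_group1.
print(fixed_width_histogram([0, 0, 1, 5, 19], bins=20))
```

With 20 bins over the range 0 to 19, each bin is 0.95 wide, so every integer value falls into its own bin.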
ValueCountFrequency (%)
0 782997
20.1%
1 622546
16.0%
2 494850
12.7%
3 394066
10.1%
4 314724
8.1%
5 251725
 
6.5%
6 202039
 
5.2%
7 162746
 
4.2%
8 131605
 
3.4%
9 106507
 
2.7%
Other values (10) 423879
10.9%
ValueCountFrequency (%)
0 782997
20.1%
1 622546
16.0%
2 494850
12.7%
3 394066
10.1%
4 314724
8.1%
ValueCountFrequency (%)
19 16520
0.4%
18 19472
0.5%
17 23111
0.6%
16 27575
0.7%
15 33102
0.9%

outstandingdebt_522A
Real number (ℝ)

MISSING  ZEROS 

Distinct: 269095
Distinct (%): 10.3%
Missing: 1277922
Missing (%): 32.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 6994.587846
Minimum: 0
Maximum: 1210629.1
Zeros: 2229538
Zeros (%): 57.3%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 41496.3719
Maximum: 1210629.1
Range: 1210629.1
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 29162.48428
Coefficient of variation (CV): 4.169292734
Kurtosis: 88.30188385
Mean: 6994.587846
Median Absolute Deviation (MAD): 0
Skewness: 7.63303055
Sum: 1.825420957 × 10^10
Variance: 850450489.7
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2229538
57.3%
10 188
 
< 0.1%
19998 82
 
< 0.1%
17998 75
 
< 0.1%
11998 73
 
< 0.1%
7998 72
 
< 0.1%
15998 66
 
< 0.1%
14998 66
 
< 0.1%
9978 64
 
< 0.1%
13998 64
 
< 0.1%
Other values (269085) 379474
 
9.8%
(Missing) 1277922
32.9%
ValueCountFrequency (%)
0 2229538
57.3%
0.002 3
 
< 0.1%
0.004 1
 
< 0.1%
0.008 1
 
< 0.1%
0.010000001 1
 
< 0.1%
ValueCountFrequency (%)
1210629.1 1
< 0.1%
1192100.9 1
< 0.1%
1092393 1
< 0.1%
1085048.1 1
< 0.1%
1071760.9 1
< 0.1%

pmtnum_8L
Real number (ℝ)

MISSING 

Distinct: 57
Distinct (%): < 0.1%
Missing: 312833
Missing (%): 8.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.78210253
Minimum: 3
Maximum: 62
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 6
median: 12
Q3: 24
95-th percentile: 36
Maximum: 62
Range: 59
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 10.46206069
Coefficient of variation (CV): 0.662906648
Kurtosis: 1.294429223
Mean: 15.78210253
Median Absolute Deviation (MAD): 6
Skewness: 1.219734053
Sum: 56418665
Variance: 109.4547138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
12 883553
22.7%
6 589676
15.2%
24 562114
14.5%
18 296861
 
7.6%
36 190785
 
4.9%
3 188014
 
4.8%
16 133452
 
3.4%
48 102226
 
2.6%
9 78979
 
2.0%
10 75231
 
1.9%
Other values (47) 473960
12.2%
(Missing) 312833
 
8.0%
ValueCountFrequency (%)
3 188014
 
4.8%
4 73219
 
1.9%
5 43429
 
1.1%
6 589676
15.2%
7 7541
 
0.2%
ValueCountFrequency (%)
62 1
 
< 0.1%
61 3
 
< 0.1%
60 3148
0.1%
58 46
 
< 0.1%
56 23
 
< 0.1%
Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.7 MiB

Length

Max length: 12
Median length: 8
Mean length: 8.092561535
Min length: 8

Characters and Unicode

Total characters: 31461322
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 3779480
97.2%
p177_117_192 57712
 
1.5%
p46_145_78 24034
 
0.6%
p149_40_170 11308
 
0.3%
p60_146_156 8054
 
0.2%
p67_102_161 4470
 
0.1%
p217_110_186 1560
 
< 0.1%
p169_115_83 848
 
< 0.1%
p140_48_169 218
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
5 11371376
36.1%
1 4095716
 
13.0%
7 3993988
 
12.7%
4 3858654
 
12.3%
a 3779480
 
12.0%
b 3779480
 
12.0%
_ 216408
 
0.7%
P 108204
 
0.3%
9 70086
 
0.2%
2 63742
 
0.2%
Other values (4) 124188
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 23577750
74.9%
Lowercase Letter 7558960
 
24.0%
Connector Punctuation 216408
 
0.7%
Uppercase Letter 108204
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 11371376
48.2%
1 4095716
 
17.4%
7 3993988
 
16.9%
4 3858654
 
16.4%
9 70086
 
0.3%
2 63742
 
0.3%
6 59762
 
0.3%
0 36918
 
0.2%
8 26660
 
0.1%
3 848
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 3779480
50.0%
b 3779480
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 216408
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 108204
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23794158
75.6%
Latin 7667164
 
24.4%

Most frequent character per script

Common
ValueCountFrequency (%)
5 11371376
47.8%
1 4095716
 
17.2%
7 3993988
 
16.8%
4 3858654
 
16.2%
_ 216408
 
0.9%
9 70086
 
0.3%
2 63742
 
0.3%
6 59762
 
0.3%
0 36918
 
0.2%
8 26660
 
0.1%
Latin
ValueCountFrequency (%)
a 3779480
49.3%
b 3779480
49.3%
P 108204
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31461322
100.0%

Distinct: 9028
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 29.7 MiB

Length

Max length: 17
Median length: 8
Mean length: 8.031521852
Min length: 7

Characters and Unicode

Total characters: 31224019
Distinct characters: 36
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5636
Unique (%): 0.1%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
ValueCountFrequency (%)
a55475b1 3839511
98.8%
p46_72_80 1281
 
< 0.1%
p104_137_180 959
 
< 0.1%
p167_22_171 699
 
< 0.1%
p139_125_64 671
 
< 0.1%
p143_116_69 665
 
< 0.1%
p25_111_112 640
 
< 0.1%
p21_76_53 612
 
< 0.1%
p116_59_165 532
 
< 0.1%
p103_114_185 526
 
< 0.1%
Other values (9018) 41588
 
1.1%

Most occurring characters

ValueCountFrequency (%)
5 11548196
37.0%
1 3943241
 
12.6%
7 3873726
 
12.4%
4 3869256
 
12.4%
a 3839535
 
12.3%
b 3839515
 
12.3%
_ 96346
 
0.3%
P 48092
 
0.2%
2 34187
 
0.1%
6 33655
 
0.1%
Other values (26) 98270
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 23400120
74.9%
Lowercase Letter 7679380
 
24.6%
Connector Punctuation 96346
 
0.3%
Uppercase Letter 48173
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3839535
50.0%
b 3839515
50.0%
e 46
 
< 0.1%
o 39
 
< 0.1%
r 37
 
< 0.1%
t 35
 
< 0.1%
u 22
 
< 0.1%
s 16
 
< 0.1%
k 16
 
< 0.1%
h 15
 
< 0.1%
Other values (13) 104
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
5 11548196
49.4%
1 3943241
 
16.9%
7 3873726
 
16.6%
4 3869256
 
16.5%
2 34187
 
0.1%
6 33655
 
0.1%
3 25672
 
0.1%
0 25385
 
0.1%
8 24952
 
0.1%
9 21850
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
P 48092
99.8%
Q 81
 
0.2%
Connector Punctuation
ValueCountFrequency (%)
_ 96346
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23496466
75.3%
Latin 7727553
 
24.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3839535
49.7%
b 3839515
49.7%
P 48092
 
0.6%
Q 81
 
< 0.1%
e 46
 
< 0.1%
o 39
 
< 0.1%
r 37
 
< 0.1%
t 35
 
< 0.1%
u 22
 
< 0.1%
s 16
 
< 0.1%
Other values (15) 135
 
< 0.1%
Common
ValueCountFrequency (%)
5 11548196
49.1%
1 3943241
 
16.8%
7 3873726
 
16.5%
4 3869256
 
16.5%
_ 96346
 
0.4%
2 34187
 
0.1%
6 33655
 
0.1%
3 25672
 
0.1%
0 25385
 
0.1%
8 24952
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31224019
100.0%

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.7 MiB

Length

Max length: 11
Median length: 8
Mean length: 8.667440306
Min length: 8

Characters and Unicode

Total characters: 33696269
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: P94_109_143
4th row: a55475b1
5th row: a55475b1
Value             Count    Frequency (%)
a55475b1          2848762  73.3%
p94_109_143       524764   13.5%
p99_56_166        358720   9.2%
p45_84_106        71489    1.8%
p198_131_9        65648    1.7%
p30_86_84         5808     0.1%
p52_67_90         3027     0.1%
p48_22_32         2793     0.1%
p196_88_176       2112     0.1%
p121_60_164       1636     < 0.1%
Other values (8)  2925     0.1%

Most occurring characters

ValueCountFrequency (%)
5 8980061
26.7%
1 4540393
13.5%
4 4053100
12.0%
7 2855155
 
8.5%
a 2848762
 
8.5%
b 2848762
 
8.5%
_ 2077844
 
6.2%
9 1905359
 
5.7%
6 1167599
 
3.5%
P 1038922
 
3.1%
Other values (4) 1380312
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 24881979
73.8%
Lowercase Letter 5697524
 
16.9%
Connector Punctuation 2077844
 
6.2%
Uppercase Letter 1038922
 
3.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 8980061
36.1%
1 4540393
18.2%
4 4053100
16.3%
7 2855155
 
11.5%
9 1905359
 
7.7%
6 1167599
 
4.7%
0 607651
 
2.4%
3 599414
 
2.4%
8 157338
 
0.6%
2 15909
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
a 2848762
50.0%
b 2848762
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2077844
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 1038922
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 26959823
80.0%
Latin 6736446
 
20.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 8980061
33.3%
1 4540393
16.8%
4 4053100
15.0%
7 2855155
 
10.6%
_ 2077844
 
7.7%
9 1905359
 
7.1%
6 1167599
 
4.3%
0 607651
 
2.3%
3 599414
 
2.2%
8 157338
 
0.6%
Latin
ValueCountFrequency (%)
a 2848762
42.3%
b 2848762
42.3%
P 1038922
 
15.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33696269
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 8980061
26.7%
1 4540393
13.5%
4 4053100
12.0%
7 2855155
 
8.5%
a 2848762
 
8.5%
b 2848762
 
8.5%
_ 2077844
 
6.2%
9 1905359
 
5.7%
6 1167599
 
3.5%
P 1038922
 
3.1%
Other values (4) 1380312
 
4.1%
Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.7 MiB

Length

Max length: 11
Median length: 8
Mean length: 8.610769548
Min length: 8

Characters and Unicode

Total characters: 33475951
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a55475b1
2nd row: a55475b1
3rd row: a55475b1
4th row: a55475b1
5th row: a55475b1
Value        Count    Frequency (%)
a55475b1     3062831  78.8%
p94_109_143  762661   19.6%
p30_86_84    29604    0.8%
p52_67_90    12511    0.3%
p69_72_116   7443     0.2%
p129_162_80  7216     0.2%
p84_14_61    2946     0.1%
p64_121_167  897      < 0.1%
p5_143_178   635      < 0.1%
p19_25_34    622      < 0.1%

Most occurring characters

ValueCountFrequency (%)
5 9202897
27.5%
1 4628582
13.8%
4 4625803
13.8%
7 3084317
 
9.2%
a 3062831
 
9.1%
b 3062831
 
9.1%
_ 1649706
 
4.9%
9 1553114
 
4.6%
P 824853
 
2.5%
0 812310
 
2.4%
Other values (4) 968707
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 24875730
74.3%
Lowercase Letter 6125662
 
18.3%
Connector Punctuation 1649706
 
4.9%
Uppercase Letter 824853
 
2.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 9202897
37.0%
1 4628582
18.6%
4 4625803
18.6%
7 3084317
 
12.4%
9 1553114
 
6.2%
0 812310
 
3.3%
3 793840
 
3.2%
8 70005
 
0.3%
6 68957
 
0.3%
2 35905
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
a 3062831
50.0%
b 3062831
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1649706
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 824853
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 26525436
79.2%
Latin 6950515
 
20.8%

Most frequent character per script

Common
ValueCountFrequency (%)
5 9202897
34.7%
1 4628582
17.4%
4 4625803
17.4%
7 3084317
 
11.6%
_ 1649706
 
6.2%
9 1553114
 
5.9%
0 812310
 
3.1%
3 793840
 
3.0%
8 70005
 
0.3%
6 68957
 
0.3%
Latin
ValueCountFrequency (%)
a 3062831
44.1%
b 3062831
44.1%
P 824853
 
11.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33475951
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 9202897
27.5%
1 4628582
13.8%
4 4625803
13.8%
7 3084317
 
9.2%
a 3062831
 
9.1%
b 3062831
 
9.1%
_ 1649706
 
4.9%
9 1553114
 
4.6%
P 824853
 
2.5%
0 812310
 
2.4%
Other values (4) 968707
 
2.9%

revolvingaccount_394A
Real number (ℝ)

MISSING 

Distinct: 60659
Distinct (%): 38.7%
Missing: 3731033
Missing (%): 96.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 740354832.4
Minimum: 540342400
Maximum: 780865400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB
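
The two percentages in this panel appear to use different denominators: the missing share is taken over all observations, while the distinct share matches the reported 38.7% only when taken over the non-missing values. The sketch below recomputes both from the counts in the report. The variable names are illustrative, and the choice of denominator for the distinct share is an inference from the figures, not something the report states.

```python
# Counts copied from the report; variable names are illustrative.
n_rows = 3887684      # Number of observations (dataset overview)
n_missing = 3731033   # Missing, revolvingaccount_394A
n_distinct = 60659    # Distinct, revolvingaccount_394A

n_present = n_rows - n_missing

# Missing share is relative to all rows.
missing_pct = 100 * n_missing / n_rows

# Assumption: distinct share is relative to non-missing rows only.
distinct_pct = 100 * n_distinct / n_present

print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct")
```

Dividing the distinct count by all rows instead would give roughly 1.6%, which does not match the panel.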

Quantile statistics

Minimum: 540342400
5-th percentile: 561029650
Q1: 740688450
Median: 760248700
Q3: 760724950
95-th percentile: 780564600
Maximum: 780865400
Range: 240523000
Interquartile range (IQR): 20036500

Descriptive statistics

Standard deviation: 54986858.03
Coefficient of variation (CV): 0.07427095174
Kurtosis: 5.600146409
Mean: 740354832.4
Median Absolute Deviation (MAD): 19415740
Skewness: -2.485268947
Sum: 1.159773249 × 10^14
Variance: 3.023554556 × 10^15
Monotonicity: Not monotonic
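
Several of the quantile and descriptive statistics for revolvingaccount_394A follow directly from others in the same panels. The sketch below, using the standard definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared), checks the reported figures against each other. The variable names are illustrative and the inputs are the rounded values printed in the report, so agreement is only approximate.

```python
import math

# Rounded values copied from the report.
minimum, maximum = 540342400, 780865400
q1, q3 = 740688450, 760724950
mean, std = 740354832.4, 54986858.03

value_range = maximum - minimum   # reported Range: 240523000
iqr = q3 - q1                     # reported IQR: 20036500
cv = std / mean                   # reported CV: 0.07427095174
variance = std ** 2               # reported Variance: 3.023554556e15

print(value_range, iqr)
# Compare with the reported values, allowing for rounding of the inputs.
print(math.isclose(cv, 0.07427095174, rel_tol=1e-6))
print(math.isclose(variance, 3.023554556e15, rel_tol=1e-6))
```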
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
760540100             35       < 0.1%
742529500             31       < 0.1%
760482240             30       < 0.1%
760434100             27       < 0.1%
760635840             26       < 0.1%
760467400             25       < 0.1%
760470700             25       < 0.1%
760540350             24       < 0.1%
760558800             24       < 0.1%
760447170             23       < 0.1%
Other values (60649)  156381   4.0%
(Missing)             3731033  96.0%
Value      Count  Frequency (%)
540342400  3      < 0.1%
540342460  7      < 0.1%
540342500  5      < 0.1%
540342600  4      < 0.1%
540342660  3      < 0.1%
Value      Count  Frequency (%)
780865400  1      < 0.1%
780865200  1      < 0.1%
780864800  2      < 0.1%
780864700  2      < 0.1%
780864260  1      < 0.1%
Distinct: 11
Distinct (%): < 0.1%
Missing: 35
Missing (%): < 0.1%
Memory size: 29.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3887649
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D
2nd row: D
3rd row: D
4th row: T
5th row: T
Value  Count    Frequency (%)
k      1605077  41.3%
d      1563834  40.2%
a      431299   11.1%
t      263947   6.8%
n      15668    0.4%
q      4766     0.1%
s      2265     0.1%
l      470      < 0.1%
h      276      < 0.1%
p      39       < 0.1%

Most occurring characters

ValueCountFrequency (%)
K 1605077
41.3%
D 1563834
40.2%
A 431299
 
11.1%
T 263947
 
6.8%
N 15668
 
0.4%
Q 4766
 
0.1%
S 2265
 
0.1%
L 470
 
< 0.1%
H 276
 
< 0.1%
P 39
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3887649
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 1605077
41.3%
D 1563834
40.2%
A 431299
 
11.1%
T 263947
 
6.8%
N 15668
 
0.4%
Q 4766
 
0.1%
S 2265
 
0.1%
L 470
 
< 0.1%
H 276
 
< 0.1%
P 39
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3887649
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
K 1605077
41.3%
D 1563834
40.2%
A 431299
 
11.1%
T 263947
 
6.8%
N 15668
 
0.4%
Q 4766
 
0.1%
S 2265
 
0.1%
L 470
 
< 0.1%
H 276
 
< 0.1%
P 39
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3887649
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
K 1605077
41.3%
D 1563834
40.2%
A 431299
 
11.1%
T 263947
 
6.8%
N 15668
 
0.4%
Q 4766
 
0.1%
S 2265
 
0.1%
L 470
 
< 0.1%
H 276
 
< 0.1%
P 39
 
< 0.1%

tenor_203L
Real number (ℝ)

MISSING 

Distinct: 57
Distinct (%): < 0.1%
Missing: 312833
Missing (%): 8.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.78210253
Minimum: 3
Maximum: 62
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 MiB

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 6
Median: 12
Q3: 24
95-th percentile: 36
Maximum: 62
Range: 59
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 10.46206069
Coefficient of variation (CV): 0.662906648
Kurtosis: 1.294429223
Mean: 15.78210253
Median Absolute Deviation (MAD): 6
Skewness: 1.219734053
Sum: 56418665
Variance: 109.4547138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
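
For readers unfamiliar with fixed-size binning, the sketch below shows one common way to build such a histogram: split the span from the minimum to the maximum into equal-width intervals and count the values in each. The function `fixed_width_histogram` and the short input list are hypothetical; this is not the code the report was generated with, and its binning details may differ.

```python
def fixed_width_histogram(values, bins=50):
    """Return (edges, counts) for `bins` equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are equal (zero span).
    width = (hi - lo) / bins or 1.0
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return edges, counts

# Illustrative input only: a handful of tenor-like values between 3 and 62.
edges, counts = fixed_width_histogram([3, 6, 12, 12, 24, 36, 62])
print(len(counts), sum(counts))
```

With a minimum of 3 and a maximum of 62, 50 bins give a bin width of 1.18, so some bins cover two adjacent integer values and others only one.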
Value              Count   Frequency (%)
12                 883553  22.7%
6                  589676  15.2%
24                 562114  14.5%
18                 296861  7.6%
36                 190785  4.9%
3                  188014  4.8%
16                 133452  3.4%
48                 102226  2.6%
9                  78979   2.0%
10                 75231   1.9%
Other values (47)  473960  12.2%
(Missing)          312833  8.0%
Value  Count   Frequency (%)
3      188014  4.8%
4      73219   1.9%
5      43429   1.1%
6      589676  15.2%
7      7541    0.2%
Value  Count   Frequency (%)
62     1       < 0.1%
61     3       < 0.1%
60     3148    0.1%
58     46      < 0.1%
56     23      < 0.1%