Cross-tablulation examples
We will use for demonstartion well-known mtcars
dataset.
Let’s start with adding labels to the dataset. Then we can continue with
tables creation.
library(expss)
data(mtcars)
mtcars = apply_labels(mtcars,
mpg = "Miles/(US) gallon",
cyl = "Number of cylinders",
disp = "Displacement (cu.in.)",
hp = "Gross horsepower",
drat = "Rear axle ratio",
wt = "Weight (1000 lbs)",
qsec = "1/4 mile time",
vs = "Engine",
vs = c("V-engine" = 0,
"Straight engine" = 1),
am = "Transmission",
am = c("Automatic" = 0,
"Manual"=1),
gear = "Number of forward gears",
carb = "Number of carburetors"
)
For quick cross-tabulation there are fre
and
cross
family of function. For simplicity we demonstrate
here only cross_cpct
which calculates column percent.
Documentation for other functions, such as cross_cases
for
counts, cross_rpct
for row percent, cross_tpct
for table percent and cross_fun
for custom summary
functions can be seen by typing ?cross_cpct
and
?cross_fun
in the console.
# 'cross_*' examples
# just simple crosstabulation, similar to base R 'table' function
cross_cases(mtcars, am, vs)
|
Engine
|
|
V-engine
|
Straight engine
|
Transmission
|
Automatic
|
12
|
7
|
Manual
|
6
|
7
|
#Total cases
|
18
|
14
|
# Table column % with multiple banners
cross_cpct(mtcars, cyl, list(total(), am, vs))
|
#Total
|
|
Transmission
|
|
Engine
|
|
|
|
Automatic
|
Manual
|
|
V-engine
|
Straight engine
|
Number of cylinders
|
4
|
34.4
|
|
15.8
|
61.5
|
|
5.6
|
71.4
|
6
|
21.9
|
|
21.1
|
23.1
|
|
16.7
|
28.6
|
8
|
43.8
|
|
63.2
|
15.4
|
|
77.8
|
|
#Total cases
|
32
|
|
19
|
13
|
|
18
|
14
|
# magrittr pipe usage and nested banners
mtcars %>%
cross_cpct(cyl, list(total(), am %nest% vs))
|
#Total
|
|
Transmission
|
|
|
|
Automatic
|
|
Manual
|
|
|
|
Engine
|
|
Engine
|
|
|
|
V-engine
|
Straight engine
|
|
V-engine
|
Straight engine
|
Number of cylinders
|
4
|
34.4
|
|
|
42.9
|
|
16.7
|
100
|
6
|
21.9
|
|
|
57.1
|
|
50.0
|
|
8
|
43.8
|
|
100
|
|
|
33.3
|
|
#Total cases
|
32
|
|
12
|
7
|
|
6
|
7
|
We have more sophisticated interface for table construction with
magrittr
piping. Table construction consists of at least of
three functions chained with pipe operator: %>%
. At
first we need to specify variables for which statistics will be computed
with tab_cells
. Secondary, we calculate statistics with one
of the tab_stat_*
functions. And last, we finalize table
creation with tab_pivot
, e. g.:
dataset %>% tab_cells(variable) %>% tab_stat_cases() %>% tab_pivot()
.
After that we can optionally sort table with tab_sort_asc
,
drop empty rows/columns with drop_rc
and transpose with
tab_transpose
. Resulting table is just a
data.frame
so we can use usual R operations on it. Detailed
documentation for table creation can be seen via ?tables
.
For significance testing see ?significance
. Generally,
tables automatically translated to HTML for output in knitr or Jupyter
notebooks. However, if we want HTML output in the R notebooks or in the
RStudio viewer we need to set options for that:
expss_output_rnotebook()
or
expss_output_viewer()
.
# simple example
mtcars %>%
tab_cells(cyl) %>%
tab_cols(total(), am) %>%
tab_stat_cpct() %>%
tab_pivot()
|
#Total
|
|
Transmission
|
|
|
|
Automatic
|
Manual
|
Number of cylinders
|
4
|
34.4
|
|
15.8
|
61.5
|
6
|
21.9
|
|
21.1
|
23.1
|
8
|
43.8
|
|
63.2
|
15.4
|
#Total cases
|
32
|
|
19
|
13
|
# table with caption
mtcars %>%
tab_cells(mpg, disp, hp, wt, qsec) %>%
tab_cols(total(), am) %>%
tab_stat_mean_sd_n() %>%
tab_last_sig_means(subtable_marks = "both") %>%
tab_pivot() %>%
set_caption("Table with summary statistics and significance marks.")
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
Table with summary statistics and significance marks.
|
|
#Total
|
|
Transmission
|
|
|
|
Automatic
|
|
Manual
|
|
|
|
A
|
|
B
|
Miles/(US) gallon
|
Mean
|
20.1
|
|
17.1 < B
|
|
24.4 > A
|
Std. dev.
|
6.0
|
|
3.8
|
|
6.2
|
Unw. valid N
|
32.0
|
|
19.0
|
|
13.0
|
Displacement (cu.in.)
|
Mean
|
230.7
|
|
290.4 > B
|
|
143.5 < A
|
Std. dev.
|
123.9
|
|
110.2
|
|
87.2
|
Unw. valid N
|
32.0
|
|
19.0
|
|
13.0
|
Gross horsepower
|
Mean
|
146.7
|
|
160.3
|
|
126.8
|
Std. dev.
|
68.6
|
|
53.9
|
|
84.1
|
Unw. valid N
|
32.0
|
|
19.0
|
|
13.0
|
Weight (1000 lbs)
|
Mean
|
3.2
|
|
3.8 > B
|
|
2.4 < A
|
Std. dev.
|
1.0
|
|
0.8
|
|
0.6
|
Unw. valid N
|
32.0
|
|
19.0
|
|
13.0
|
1/4 mile time
|
Mean
|
17.8
|
|
18.2
|
|
17.4
|
Std. dev.
|
1.8
|
|
1.8
|
|
1.8
|
Unw. valid N
|
32.0
|
|
19.0
|
|
13.0
|
# Table with the same summary statistics. Statistics labels in columns.
mtcars %>%
tab_cells(mpg, disp, hp, wt, qsec) %>%
tab_cols(total(label = "#Total| |"), am) %>%
tab_stat_fun(Mean = w_mean, "Std. dev." = w_sd, "Valid N" = w_n, method = list) %>%
tab_pivot()
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
|
#Total
|
|
Transmission
|
|
|
|
Automatic
|
|
Manual
|
|
Mean
|
Std. dev.
|
Valid N
|
|
Mean
|
Std. dev.
|
Valid N
|
|
Mean
|
Std. dev.
|
Valid N
|
Miles/(US) gallon
|
20.1
|
6.0
|
32
|
|
17.1
|
3.8
|
19
|
|
24.4
|
6.2
|
13
|
Displacement (cu.in.)
|
230.7
|
123.9
|
32
|
|
290.4
|
110.2
|
19
|
|
143.5
|
87.2
|
13
|
Gross horsepower
|
146.7
|
68.6
|
32
|
|
160.3
|
53.9
|
19
|
|
126.8
|
84.1
|
13
|
Weight (1000 lbs)
|
3.2
|
1.0
|
32
|
|
3.8
|
0.8
|
19
|
|
2.4
|
0.6
|
13
|
1/4 mile time
|
17.8
|
1.8
|
32
|
|
18.2
|
1.8
|
19
|
|
17.4
|
1.8
|
13
|
# Different statistics for different variables.
mtcars %>%
tab_cols(total(), vs) %>%
tab_cells(mpg) %>%
tab_stat_mean() %>%
tab_stat_valid_n() %>%
tab_cells(am) %>%
tab_stat_cpct(total_row_position = "none", label = "col %") %>%
tab_stat_rpct(total_row_position = "none", label = "row %") %>%
tab_stat_tpct(total_row_position = "none", label = "table %") %>%
tab_pivot(stat_position = "inside_rows")
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
|
|
|
#Total
|
|
Engine
|
|
|
|
|
|
V-engine
|
Straight engine
|
Miles/(US) gallon
|
Mean
|
|
|
20.1
|
|
16.6
|
24.6
|
Valid N
|
|
|
32.0
|
|
18.0
|
14.0
|
Transmission
|
Automatic
|
col %
|
|
59.4
|
|
66.7
|
50.0
|
|
row %
|
|
100.0
|
|
63.2
|
36.8
|
|
table %
|
|
59.4
|
|
37.5
|
21.9
|
Manual
|
col %
|
|
40.6
|
|
33.3
|
50.0
|
|
row %
|
|
100.0
|
|
46.2
|
53.8
|
|
table %
|
|
40.6
|
|
18.8
|
21.9
|
# Table with split by rows and with custom totals.
mtcars %>%
tab_cells(cyl) %>%
tab_cols(total(), vs) %>%
tab_rows(am) %>%
tab_stat_cpct(total_row_position = "above",
total_label = c("number of cases", "row %"),
total_statistic = c("u_cases", "u_rpct")) %>%
tab_pivot()
|
|
|
#Total
|
|
Engine
|
|
|
|
|
|
|
V-engine
|
Straight engine
|
Transmission
|
Automatic
|
Number of cylinders
|
#number of cases
|
|
19
|
|
12
|
7
|
|
|
#row %
|
|
100
|
|
63.2
|
36.8
|
|
|
4
|
|
15.8
|
|
|
42.9
|
|
|
6
|
|
21.1
|
|
|
57.1
|
|
|
8
|
|
63.2
|
|
100.0
|
|
Manual
|
Number of cylinders
|
#number of cases
|
|
13
|
|
6
|
7
|
|
|
#row %
|
|
100
|
|
46.2
|
53.8
|
|
|
4
|
|
61.5
|
|
16.7
|
100.0
|
|
|
6
|
|
23.1
|
|
50.0
|
|
|
|
8
|
|
15.4
|
|
33.3
|
|
# Linear regression by groups.
mtcars %>%
tab_cells(sheet(mpg, disp, hp, wt, qsec)) %>%
tab_cols(total(label = "#Total| |"), am) %>%
tab_stat_fun_df(
function(x){
frm = reformulate(".", response = as.name(names(x)[1]))
model = lm(frm, data = x)
sheet('Coef.' = coef(model),
confint(model)
)
}
) %>%
tab_pivot()
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
|
#Total
|
|
Transmission
|
|
|
|
Automatic
|
|
Manual
|
|
Coef.
|
2.5 %
|
97.5 %
|
|
Coef.
|
2.5 %
|
97.5 %
|
|
Coef.
|
2.5 %
|
97.5 %
|
(Intercept)
|
27.3
|
9.6
|
45.1
|
|
21.8
|
-1.9
|
45.5
|
|
13.3
|
-21.9
|
48.4
|
Displacement (cu.in.)
|
0.0
|
0.0
|
0.0
|
|
0.0
|
0.0
|
0.0
|
|
0.0
|
-0.1
|
0.1
|
Gross horsepower
|
0.0
|
-0.1
|
0.0
|
|
0.0
|
-0.1
|
0.0
|
|
0.0
|
0.0
|
0.1
|
Weight (1000 lbs)
|
-4.6
|
-7.2
|
-2.0
|
|
-2.3
|
-5.0
|
0.4
|
|
-7.7
|
-12.5
|
-2.9
|
1/4 mile time
|
0.5
|
-0.4
|
1.5
|
|
0.4
|
-0.7
|
1.6
|
|
1.6
|
-0.2
|
3.4
|
Example of data processing with multiple-response variables
Here we use truncated dataset with data from product test of two
samples of chocolate sweets. 150 respondents tested two kinds of sweets
(codenames: VSX123 and SDF546). Sample was divided into two groups
(cells) of 75 respondents in each group. In cell 1 product VSX123 was
presented first and then SDF546. In cell 2 sweets were presented in
reversed order. Questions about respondent impressions about first
product are in the block A (and about second tested product in the block
B). At the end of the questionnaire there was a question about the
preferences between sweets.
List of variables:
id
Respondent Id
cell
First tested product (cell number)
s2a
Age
a1_1-a1_6
What did you like in these sweets? Multiple
response. First tested product
a22
Overall quality. First tested product
b1_1-b1_6
What did you like in these sweets? Multiple
response. Second tested product
b22
Overall quality. Second tested product
c1
Preferences
data(product_test)
w = product_test # shorter name to save some keystrokes
# here we recode variables from first/second tested product to separate variables for each product according to their cells
# 'h' variables - VSX123 sample, 'p' variables - 'SDF456' sample
# also we recode preferences from first/second product to true names
# for first cell there are no changes, for second cell we should exchange 1 and 2.
w = w %>%
let_if(cell == 1,
h1_1 %to% h1_6 := recode(a1_1 %to% a1_6, other ~ copy),
p1_1 %to% p1_6 := recode(b1_1 %to% b1_6, other ~ copy),
h22 := recode(a22, other ~ copy),
p22 := recode(b22, other ~ copy),
c1r = c1
) %>%
let_if(cell == 2,
p1_1 %to% p1_6 := recode(a1_1 %to% a1_6, other ~ copy),
h1_1 %to% h1_6 := recode(b1_1 %to% b1_6, other ~ copy),
p22 := recode(a22, other ~ copy),
h22 := recode(b22, other ~ copy),
c1r := recode(c1, 1 ~ 2, 2 ~ 1, other ~ copy)
) %>%
let(
# recode age by groups
age_cat = recode(s2a, lo %thru% 25 ~ 1, lo %thru% hi ~ 2),
# count number of likes
# codes 2 and 99 are ignored.
h_likes = count_row_if(1 | 3 %thru% 98, h1_1 %to% h1_6),
p_likes = count_row_if(1 | 3 %thru% 98, p1_1 %to% p1_6)
)
# here we prepare labels for future usage
codeframe_likes = num_lab("
1 Liked everything
2 Disliked everything
3 Chocolate
4 Appearance
5 Taste
6 Stuffing
7 Nuts
8 Consistency
98 Other
99 Hard to answer
")
overall_liking_scale = num_lab("
1 Extremely poor
2 Very poor
3 Quite poor
4 Neither good, nor poor
5 Quite good
6 Very good
7 Excellent
")
w = apply_labels(w,
c1r = "Preferences",
c1r = num_lab("
1 VSX123
2 SDF456
3 Hard to say
"),
age_cat = "Age",
age_cat = c("18 - 25" = 1, "26 - 35" = 2),
h1_1 = "Likes. VSX123",
p1_1 = "Likes. SDF456",
h1_1 = codeframe_likes,
p1_1 = codeframe_likes,
h_likes = "Number of likes. VSX123",
p_likes = "Number of likes. SDF456",
h22 = "Overall quality. VSX123",
p22 = "Overall quality. SDF456",
h22 = overall_liking_scale,
p22 = overall_liking_scale
)
Are there any significant differences between preferences? Yes,
difference is significant.
# 'tab_mis_val(3)' remove 'hard to say' from vector
w %>% tab_cols(total(), age_cat) %>%
tab_cells(c1r) %>%
tab_mis_val(3) %>%
tab_stat_cases() %>%
tab_last_sig_cases() %>%
tab_pivot()
|
#Total
|
|
Age
|
|
|
|
18 - 25
|
26 - 35
|
Preferences
|
VSX123
|
94.0
|
|
46.0
|
48.0
|
SDF456
|
50.0
|
|
22.0
|
28.0
|
Hard to say
|
|
|
|
|
#Chi-squared p-value
|
<0.05
|
|
(warn.)
|
|
#Total cases
|
144.0
|
|
68.0
|
76.0
|
Further we calculate distribution of answers in the survey
questions.
# lets specify repeated parts of table creation chains
banner = w %>% tab_cols(total(), age_cat, c1r)
# column percent with significance
tab_cpct_sig = . %>% tab_stat_cpct() %>%
tab_last_sig_cpct(sig_labels = paste0("<b>",LETTERS, "</b>"))
# means with siginifcance
tab_means_sig = . %>% tab_stat_mean_sd_n(labels = c("<b><u>Mean</u></b>", "sd", "N")) %>%
tab_last_sig_means(
sig_labels = paste0("<b>",LETTERS, "</b>"),
keep = "means")
# Preferences
banner %>%
tab_cells(c1r) %>%
tab_cpct_sig() %>%
tab_pivot()
|
#Total
|
|
Age
|
|
Preferences
|
|
|
|
18 - 25
|
|
26 - 35
|
|
VSX123
|
|
SDF456
|
|
Hard to say
|
|
|
|
A
|
|
B
|
|
A
|
|
B
|
|
C
|
Preferences
|
VSX123
|
62.7
|
|
65.7
|
|
60.0
|
|
100.0
|
|
|
|
|
SDF456
|
33.3
|
|
31.4
|
|
35.0
|
|
|
|
100.0
|
|
|
Hard to say
|
4.0
|
|
2.9
|
|
5.0
|
|
|
|
|
|
100.0
|
#Total cases
|
150
|
|
70
|
|
80
|
|
94
|
|
50
|
|
6
|
# Overall liking
banner %>%
tab_cells(h22) %>%
tab_means_sig() %>%
tab_cpct_sig() %>%
tab_cells(p22) %>%
tab_means_sig() %>%
tab_cpct_sig() %>%
tab_pivot()
## Warning in as.data.frame.integer(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.integer()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.integer(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.integer()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
|
#Total
|
|
Age
|
|
Preferences
|
|
|
|
18 - 25
|
|
26 - 35
|
|
VSX123
|
|
SDF456
|
|
Hard to say
|
|
|
|
A
|
|
B
|
|
A
|
|
B
|
|
C
|
Overall quality. VSX123
|
Mean
|
5.5
|
|
5.4
|
|
5.6
|
|
5.3
|
|
5.8 A
|
|
5.5
|
Extremely poor
|
|
|
|
|
|
|
|
|
|
|
|
Very poor
|
|
|
|
|
|
|
|
|
|
|
|
Quite poor
|
2.0
|
|
2.9
|
|
1.2
|
|
3.2
|
|
|
|
|
Neither good, nor poor
|
10.7
|
|
11.4
|
|
10.0
|
|
14.9 B
|
|
2.0
|
|
16.7
|
Quite good
|
39.3
|
|
45.7
|
|
33.8
|
|
40.4
|
|
38.0
|
|
33.3
|
Very good
|
33.3
|
|
24.3
|
|
41.2 A
|
|
30.9
|
|
38.0
|
|
33.3
|
Excellent
|
14.7
|
|
15.7
|
|
13.8
|
|
10.6
|
|
22.0
|
|
16.7
|
#Total cases
|
150
|
|
70
|
|
80
|
|
94
|
|
50
|
|
6
|
Overall quality. SDF456
|
Mean
|
5.4
|
|
5.3
|
|
5.4
|
|
5.4
|
|
5.3
|
|
5.7
|
Extremely poor
|
|
|
|
|
|
|
|
|
|
|
|
Very poor
|
0.7
|
|
|
|
1.2
|
|
1.1
|
|
|
|
|
Quite poor
|
2.7
|
|
4.3
|
|
1.2
|
|
2.1
|
|
4.0
|
|
|
Neither good, nor poor
|
16.7
|
|
20.0
|
|
13.8
|
|
18.1
|
|
14.0
|
|
16.7
|
Quite good
|
31.3
|
|
27.1
|
|
35.0
|
|
28.7
|
|
38.0
|
|
16.7
|
Very good
|
35.3
|
|
35.7
|
|
35.0
|
|
35.1
|
|
34.0
|
|
50.0
|
Excellent
|
13.3
|
|
12.9
|
|
13.8
|
|
14.9
|
|
10.0
|
|
16.7
|
#Total cases
|
150
|
|
70
|
|
80
|
|
94
|
|
50
|
|
6
|
# Likes
banner %>%
tab_cells(h_likes) %>%
tab_means_sig() %>%
tab_cells(mrset(h1_1 %to% h1_6)) %>%
tab_cpct_sig() %>%
tab_cells(p_likes) %>%
tab_means_sig() %>%
tab_cells(mrset(p1_1 %to% p1_6)) %>%
tab_cpct_sig() %>%
tab_pivot()
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
|
#Total
|
|
Age
|
|
Preferences
|
|
|
|
18 - 25
|
|
26 - 35
|
|
VSX123
|
|
SDF456
|
|
Hard to say
|
|
|
|
A
|
|
B
|
|
A
|
|
B
|
|
C
|
Number of likes. VSX123
|
Mean
|
2.0
|
|
2.0
|
|
2.1
|
|
1.9
|
|
2.2
|
|
2.3
|
Likes. VSX123
|
Liked everything
|
|
|
|
|
|
|
|
|
|
|
|
Disliked everything
|
3.3
|
|
1.4
|
|
5.0
|
|
4.3
|
|
2.0
|
|
|
Chocolate
|
34.0
|
|
38.6
|
|
30.0
|
|
35.1
|
|
32.0
|
|
33.3
|
Appearance
|
29.3
|
|
21.4
|
|
36.2 A
|
|
25.5
|
|
38.0
|
|
16.7
|
Taste
|
32.0
|
|
38.6
|
|
26.2
|
|
23.4
|
|
48.0 A
|
|
33.3
|
Stuffing
|
27.3
|
|
20.0
|
|
33.8
|
|
28.7
|
|
26.0
|
|
16.7
|
Nuts
|
66.7
|
|
72.9
|
|
61.3
|
|
69.1
|
|
60.0
|
|
83.3
|
Consistency
|
12.0
|
|
4.3
|
|
18.8 A
|
|
8.5
|
|
14.0
|
|
50.0 A B
|
Other
|
|
|
|
|
|
|
|
|
|
|
|
Hard to answer
|
|
|
|
|
|
|
|
|
|
|
|
#Total cases
|
150
|
|
70
|
|
80
|
|
94
|
|
50
|
|
6
|
Number of likes. SDF456
|
Mean
|
2.0
|
|
2.0
|
|
2.1
|
|
2.0
|
|
2.0
|
|
2.0
|
Likes. SDF456
|
Liked everything
|
|
|
|
|
|
|
|
|
|
|
|
Disliked everything
|
1.3
|
|
1.4
|
|
1.2
|
|
2.1
|
|
|
|
|
Chocolate
|
32.0
|
|
27.1
|
|
36.2
|
|
29.8
|
|
34.0
|
|
50.0
|
Appearance
|
32.0
|
|
35.7
|
|
28.7
|
|
34.0
|
|
30.0
|
|
16.7
|
Taste
|
39.3
|
|
42.9
|
|
36.2
|
|
36.2
|
|
44.0
|
|
50.0
|
Stuffing
|
27.3
|
|
24.3
|
|
30.0
|
|
31.9
|
|
20.0
|
|
16.7
|
Nuts
|
61.3
|
|
60.0
|
|
62.5
|
|
58.5
|
|
68.0
|
|
50.0
|
Consistency
|
10.0
|
|
5.7
|
|
13.8
|
|
11.7
|
|
6.0
|
|
16.7
|
Other
|
0.7
|
|
|
|
1.2
|
|
1.1
|
|
|
|
|
Hard to answer
|
|
|
|
|
|
|
|
|
|
|
|
#Total cases
|
150
|
|
70
|
|
80
|
|
94
|
|
50
|
|
6
|
# below more complicated table where we compare likes side by side
# Likes - side by side comparison
w %>%
tab_cols(total(label = "#Total| |"), c1r) %>%
tab_cells(list(unvr(mrset(h1_1 %to% h1_6)))) %>%
tab_stat_cpct(label = var_lab(h1_1)) %>%
tab_cells(list(unvr(mrset(p1_1 %to% p1_6)))) %>%
tab_stat_cpct(label = var_lab(p1_1)) %>%
tab_pivot(stat_position = "inside_columns")
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
## Warning in as.data.frame.numeric(x[[i]], optional = TRUE, nm = "x[[i]]", :
## Direct call of 'as.data.frame.numeric()' is deprecated. Use
## 'as.data.frame.vector()' or 'as.data.frame()' instead
|
#Total
|
|
Preferences
|
|
|
|
VSX123
|
|
SDF456
|
|
Hard to say
|
|
Likes. VSX123
|
Likes. SDF456
|
|
Likes. VSX123
|
Likes. SDF456
|
|
Likes. VSX123
|
Likes. SDF456
|
|
Likes. VSX123
|
Likes. SDF456
|
Liked everything
|
|
|
|
|
|
|
|
|
|
|
|
Disliked everything
|
3.3
|
1.3
|
|
4.3
|
2.1
|
|
2
|
|
|
|
|
Chocolate
|
34.0
|
32.0
|
|
35.1
|
29.8
|
|
32
|
34
|
|
33.3
|
50.0
|
Appearance
|
29.3
|
32.0
|
|
25.5
|
34.0
|
|
38
|
30
|
|
16.7
|
16.7
|
Taste
|
32.0
|
39.3
|
|
23.4
|
36.2
|
|
48
|
44
|
|
33.3
|
50.0
|
Stuffing
|
27.3
|
27.3
|
|
28.7
|
31.9
|
|
26
|
20
|
|
16.7
|
16.7
|
Nuts
|
66.7
|
61.3
|
|
69.1
|
58.5
|
|
60
|
68
|
|
83.3
|
50.0
|
Consistency
|
12.0
|
10.0
|
|
8.5
|
11.7
|
|
14
|
6
|
|
50.0
|
16.7
|
Other
|
|
0.7
|
|
|
1.1
|
|
|
|
|
|
|
Hard to answer
|
|
|
|
|
|
|
|
|
|
|
|
#Total cases
|
150
|
150
|
|
94
|
94
|
|
50
|
50
|
|
6
|
6
|
We can save labelled dataset as *.csv file with accompanying R code
for labelling.
write_labelled_csv(w, file filename = "product_test.csv")
Or, we can save dataset as *.csv file with SPSS syntax to read data
and apply labels.
write_labelled_spss(w, file filename = "product_test.csv")