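Cross-tabulation analysis examines the association between categorical variables by building a contingency table (cross table) of their frequencies. The association is assessed from the cell frequencies, using the chi-square ($\chi^2$) statistic as the test statistic. For this reason, cross-tabulation analysis is also called the chi-square ($\chi^2$) test.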
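Null hypothesis: there is no difference between the expected and observed frequencies (the two variables are independent).
Alternative hypothesis: there is a difference between the expected and observed frequencies (the two variables are associated).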
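To make the comparison of expected and observed frequencies concrete, here is a minimal sketch that computes the statistic by hand and checks it against `chisq.test()`. The 2 x 2 table and the names `observed`, `expected`, and `chi_sq` are made up for illustration and are not taken from the data used in this post.

```r
# Hypothetical 2 x 2 table of observed counts (not the post's data).
observed <- matrix(c(30, 20,
                     10, 40),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(group = c("A", "B"),
                                   answer = c("yes", "no")))

# Expected count under independence: row total * column total / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic, degrees of freedom, and upper-tail p-value.
chi_sq  <- sum((observed - expected)^2 / expected)
df      <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# The built-in test without continuity correction should give the same statistic.
builtin <- chisq.test(observed, correct = FALSE)
print(c(by_hand = chi_sq, builtin = unname(builtin$statistic)))
```

A large statistic (and a small p-value) means the observed counts are far from what independence would predict, which is evidence against the null hypothesis.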
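Cross-tabulation analysis applies when both variables are categorical. For other combinations of variable types, a different method is used:
When the independent variable is categorical and the dependent variable is continuous, use a t-test or analysis of variance (ANOVA).
If the independent variable has 2 categories, use a t-test.
If the independent variable has 3 or more categories, use ANOVA.
When both the independent and dependent variables are continuous, use correlation analysis or linear regression.
When the independent variable is continuous and the dependent variable is categorical, use logistic regression or discriminant analysis. (Cluster analysis is sometimes listed here as well, but it is an unsupervised method that does not use a dependent variable.)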
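The selection rules above can be sketched as a small lookup function. `suggest_analysis()` is a hypothetical helper written for this illustration, not part of any package, and it only inspects the variable types.

```r
# Hypothetical helper: suggest a method from the types of the two variables.
# x is the independent variable, y is the dependent variable.
suggest_analysis <- function(x, y) {
  is_cat <- function(v) is.factor(v) || is.character(v) || is.logical(v)
  if (is_cat(x) && is_cat(y)) {
    "cross-tabulation (chi-square test)"
  } else if (is_cat(x)) {
    # Categorical predictor, continuous outcome: depends on the number of groups.
    if (length(unique(x[!is.na(x)])) == 2) "t-test" else "ANOVA"
  } else if (is_cat(y)) {
    "logistic regression or discriminant analysis"
  } else {
    "correlation analysis or linear regression"
  }
}

# Made-up example vectors.
group2 <- factor(c("a", "b", "a", "b"))
group3 <- factor(c("a", "b", "c", "a"))
score  <- c(1.2, 3.4, 2.2, 5.1)

print(suggest_analysis(group2, score))
print(suggest_analysis(group3, score))
print(suggest_analysis(score, group2))
print(suggest_analysis(group2, group3))
```

Survey data often stores categories as integer codes, so in practice such columns would need `factor()` applied first for a type-based check like this to classify them correctly.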
install.packages('descr')
package 'descr' successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\MyCom\AppData\Local\Temp\Rtmp8OPV9d\downloaded_packages
library(descr)
Warning message:
"package 'descr' was built under R version 3.6.3"
freq(a$gender)  # freq() requires the descr package to be installed and loaded
| | Frequency | Percent |
|---|---|---|
| male | 132 | 53.4413 |
| female | 115 | 46.5587 |
| Total | 247 | 100.0000 |
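The same frequency table can be approximated in base R without `descr`. This is a minimal sketch: the `gender` vector is rebuilt from the counts in the table above rather than read from the original data set.

```r
# Hypothetical vector rebuilt from the counts shown above (132 male, 115 female).
gender <- factor(rep(c("male", "female"), times = c(132, 115)),
                 levels = c("male", "female"))

counts  <- table(gender)             # absolute frequencies
percent <- 100 * prop.table(counts)  # relative frequencies in percent

print(cbind(Frequency = counts, Percent = round(percent, 4)))
```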
install.packages('ggplot2')
library(ggplot2)
There is a binary version available but the source version is later:
binary source needs_compilation
ggplot2 3.3.3 3.3.5 FALSE
installing the source package 'ggplot2'
Registered S3 methods overwritten by 'tibble':
method from
format.tbl pillar
print.tbl pillar
also installing the dependency 'lme4'
There are binary versions available but the source versions are later:
binary source needs_compilation
lme4 1.1-26 1.1-27.1 TRUE
car 3.0-10 3.0-12 FALSE
Binaries will be installed
package 'lme4' successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\MyCom\AppData\Local\Temp\Rtmp8OPV9d\downloaded_packages
installing the source package 'car'
Loading required package: carData
a$eduM <- recode(edu, "lo:2=1; 3:4=2; 5:hi=3; else=NA")  # keep the edu column as is and add a new column eduM: values up to 2 become 1, 3 to 4 become 2, 5 and above become 3
The following objects are masked from a (pos = 5):
amount, aware, count, decision, edu, gender, job, location,
marriage, mincome, promo, propensity, repurchase, satisf_al,
satisf_b, satisf_i, skin
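For integer codes, the grouping done by `recode()` above can also be sketched with base R's `cut()`. The `edu` values below are made up to span the 2 to 8 range and are not the original column.

```r
# Hypothetical education codes spanning the 2..8 range.
edu <- c(2, 3, 4, 5, 6, 7, 8)

# Right-closed intervals: (-Inf, 2] -> 1, (2, 4] -> 2, (4, Inf) -> 3.
eduM <- cut(edu, breaks = c(-Inf, 2, 4, Inf), labels = c(1, 2, 3))

# cut() returns a factor; the cross table shows which code lands in which group.
print(table(edu, eduM))
```

The two approaches agree only for whole-number codes: a value such as 2.5 falls into the second interval with `cut()`, whereas the `recode()` specification would send it to the `else` branch.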
max(amount)
5000000
min(amount)
3000
sum(amount)
38023000
mean(amount)
153939.271255061
var(amount)
158463699549.06
sd(amount)
398074.992368348
install.packages('psych')
also installing the dependencies 'tmvnsim', 'mnormt'
There is a binary version available but the source version is later:
binary source needs_compilation
psych 2.1.3 2.1.9 FALSE
package 'tmvnsim' successfully unpacked and MD5 sums checked
package 'mnormt' successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\MyCom\AppData\Local\Temp\Rtmp8OPV9d\downloaded_packages
installing the source package 'psych'
library(psych)
Attaching package: 'psych'
The following object is masked from 'package:car':
logit
The following objects are masked from 'package:ggplot2':
%+%, alpha
describe(a)
| | vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| gender* | 1 | 247 | 1.465587e+00 | 4.998272e-01 | 1 | 1.457286 | 0.0000 | 1 | 2 | 1 | 0.13714193 | -1.9891964 | 3.180324e-02 |
| marriage | 2 | 247 | 1.712551e+00 | 4.534918e-01 | 2 | 1.763819 | 0.0000 | 1 | 2 | 1 | -0.93360038 | -1.1329280 | 2.885499e-02 |
| edu | 3 | 247 | 4.566802e+00 | 1.709191e+00 | 4 | 4.462312 | 0.0000 | 2 | 8 | 6 | 0.63815084 | -0.3287273 | 1.087532e-01 |
| job | 4 | 247 | 4.578947e+00 | 2.199603e+00 | 4 | 4.422111 | 1.4826 | 1 | 10 | 9 | 0.68090959 | -0.1651951 | 1.399574e-01 |
| mincome | 5 | 247 | 3.757085e+00 | 1.674079e+00 | 4 | 3.819095 | 1.4826 | 1 | 6 | 5 | -0.10186401 | -1.2266191 | 1.065191e-01 |
| aware | 6 | 247 | 3.319838e+00 | 5.575692e+00 | 2 | 1.924623 | 0.0000 | 1 | 31 | 30 | 3.98663626 | 15.7864331 | 3.547728e-01 |
| count | 7 | 247 | 4.327935e+00 | 4.422061e+00 | 3 | 3.492462 | 2.9652 | 1 | 36 | 35 | 3.08793674 | 13.5854742 | 2.813690e-01 |
| amount | 8 | 247 | 1.539393e+05 | 3.980750e+05 | 52000 | 83798.994975 | 47443.2000 | 3000 | 5000000 | 4997000 | 8.62153257 | 92.2401960 | 2.532891e+04 |
| decision | 9 | 247 | 2.388664e+00 | 7.615994e-01 | 3 | 2.482412 | 0.0000 | 1 | 3 | 2 | -0.77786381 | -0.8701841 | 4.845941e-02 |
| propensity | 10 | 247 | 1.975709e+00 | 6.803103e-01 | 2 | 1.969849 | 0.0000 | 1 | 3 | 2 | 0.02958183 | -0.8487310 | 4.328711e-02 |
| skin | 11 | 247 | 2.761134e+00 | 1.488311e+00 | 3 | 2.703518 | 2.9652 | 1 | 5 | 4 | 0.15331957 | -1.3373908 | 9.469894e-02 |
| promo | 12 | 247 | 2.016194e+00 | 8.212998e-01 | 2 | 1.919598 | 0.0000 | 1 | 4 | 3 | 0.84726856 | 0.5434697 | 5.225806e-02 |
| location | 13 | 247 | 2.465587e+00 | 1.073437e+00 | 3 | 2.371859 | 1.4826 | 1 | 5 | 4 | 0.55038656 | 0.1943319 | 6.830114e-02 |
| satisf_b | 14 | 247 | 2.890688e+00 | 7.809953e-01 | 3 | 2.869347 | 0.0000 | 1 | 5 | 4 | 0.14047539 | 0.3237416 | 4.969354e-02 |
| satisf_i | 15 | 247 | 3.404858e+00 | 8.301096e-01 | 3 | 3.482412 | 1.4826 | 1 | 5 | 4 | -0.69559430 | 0.9204758 | 5.281861e-02 |
| satisf_al | 16 | 247 | 3.461538e+00 | 7.527311e-01 | 4 | 3.512563 | 1.4826 | 1 | 5 | 4 | -0.98037384 | 2.1617488 | 4.789513e-02 |
| repurchase | 17 | 247 | 3.554656e+00 | 7.241820e-01 | 4 | 3.633166 | 0.0000 | 1 | 5 | 4 | -1.27727971 | 2.5541785 | 4.607860e-02 |
| eduM* | 18 | 247 | 2.170040e+00 | 6.209693e-01 | 2 | 2.211055 | 0.0000 | 1 | 3 | 2 | -0.12856235 | -0.5355836 | 3.951133e-02 |
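Two of the `describe()` columns are simple functions of the others: `se` is the standard error of the mean, sd / sqrt(n), and `range` is max minus min. The sketch below checks this for the `amount` row, using values copied from the outputs above. The variable names are chosen for this illustration.

```r
# Values copied from the amount outputs above.
n       <- 247
sd_amt  <- 398074.992368348
min_amt <- 3000
max_amt <- 5000000

# Standard error of the mean: sd / sqrt(n).
se_amt <- sd_amt / sqrt(n)
# Range as reported by describe(): max - min.
range_amt <- max_amt - min_amt

print(c(se = se_amt, range = range_amt))
```

Both results agree with the `se` and `range` entries of the `amount` row up to rounding.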
summary(amount)
Min. 1st Qu. Median Mean 3rd Qu. Max.
3000 30000 52000 153939 100000 5000000
chisq.test(gender, aware, correct = F)
Warning message in chisq.test(gender, aware, correct = F):
"Chi-squared approximation may be incorrect"
Pearson's Chi-squared test
data: gender and aware
X-squared = 54.35, df = 14, p-value = 1.119e-06
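The "Chi-squared approximation may be incorrect" warning is raised when some expected cell counts are small (a common rule of thumb is below 5), which is likely here because `aware` has many sparsely populated levels. The following sketch uses a made-up sparse table, not the post's data, to show how to inspect the expected counts and two standard alternatives from base R's `stats` package.

```r
set.seed(1)  # make the simulated p-value reproducible

# Hypothetical sparse table (not the post's data): several cells are tiny.
sparse <- matrix(c(12, 3, 1,
                    9, 2, 4),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(gender = c("male", "female"),
                                 level = c("low", "mid", "high")))

# Inspect expected counts; small values are what trigger the warning.
res <- suppressWarnings(chisq.test(sparse, correct = FALSE))
print(res$expected)

# Option 1: Monte Carlo p-value instead of the chi-square approximation.
sim <- chisq.test(sparse, simulate.p.value = TRUE, B = 10000)
# Option 2: Fisher's exact test, which does not rely on large-sample theory.
fis <- fisher.test(sparse)
print(c(simulated = sim$p.value, fisher = fis$p.value))
```

Another common remedy is to merge sparse categories before testing, much as `edu` was collapsed into `eduM` earlier.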
install.packages('gmodels')
package 'gmodels' successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\MyCom\AppData\Local\Temp\Rtmp8OPV9d\downloaded_packages
library(gmodels)
Warning message:
"package 'gmodels' was built under R version 3.6.3"
Attaching package: 'gmodels'
The following object is masked from 'package:descr':
CrossTable
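The masking message means that a plain `CrossTable()` call now resolves to the `gmodels` version. The `descr` version can still be reached with the namespace prefix, `descr::CrossTable()`. For readers who want a cross table without either package, here is a minimal base R sketch. The data frame `d` is made up for illustration.

```r
# Hypothetical data frame (not the post's data).
d <- data.frame(
  gender = c("male", "male", "female", "female",
             "male", "female", "male", "female"),
  eduM   = c(1, 2, 2, 3, 3, 3, 1, 2)
)

# Observed counts, with dimension names for readable output.
tab <- table(d$gender, d$eduM, dnn = c("gender", "eduM"))

print(addmargins(tab))                        # counts with row and column totals
print(round(prop.table(tab, margin = 1), 3))  # row proportions
```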