
SAS - Sampling

2017. 12. 7. 01:42


Sampling with SAS


sample.csv


Simple random sampling


Code


FILENAME REFFILE 'C:\Users\\sample.csv';

PROC IMPORT DATAFILE=REFFILE
    DBMS=CSV
    OUT=WORK.population;
    GETNAMES=YES;
RUN;

proc surveyselect data=population method=srs n=200
    out=Work.sample;
run;


Results


The SAS System

The SURVEYSELECT Procedure

Selection Method Simple Random Sampling

Input Data Set POPULATION
Random Number Seed 137581001
Sample Size 200
Selection Probability 0.13708
Sampling Weight 7.295
Output Data Set SAMPLE
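
The Random Number Seed reported in this output was picked by SAS, because the call above does not set one; rerunning the step as written would draw a different sample. A minimal sketch of a reproducible draw, assuming the same WORK.population data set, passes the reported seed through the SEED= option. The output name sample_repro is hypothetical and only keeps the original sample from being overwritten.

```sas
/* Reproducible simple random sample: fix the seed explicitly.
   137581001 is the seed reported in the output above;
   sample_repro is a hypothetical output data set name. */
proc surveyselect data=population method=srs n=200
    seed=137581001
    out=work.sample_repro;
run;
```

With the same input data set, the same selection options, and the same seed, the procedure should select the same 200 rows on every run.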


Estimating the sample mean



Code


proc surveymeans data=sample total=1459;
    var MSSubclass;
run;


Results


The SAS System

The SURVEYMEANS Procedure

Data Summary
Number of Observations 200

Statistics
Variable     N     Mean        Std Error of Mean   95% CL for Mean
MSSubClass   200   59.450000   2.648750            54.2267810   64.6732190


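  • Here the estimated variance of the sample mean is computed as (N-n)/N * s^2/n. (Because the population is finite, the finite population correction is applied; this is why total=1459 is passed to the procedure.)
  • When the population variance (sigma^2) is known, the variance of the sample mean is (N-n)/(N-1) * sigma^2/n.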
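
As a rough check of how these formulas connect to the printed output, the arithmetic can be written out with N = 1459 and n = 200. The t quantile of about 1.972 is an assumption here (n - 1 = 199 degrees of freedom); it is consistent with the reported limits, whose half-width divided by the standard error gives the same value.

```latex
\[
\widehat{\mathrm{Var}}(\bar{y}) = \frac{N-n}{N}\cdot\frac{s^2}{n},
\qquad
\frac{N-n}{N} = \frac{1459-200}{1459} = \frac{1259}{1459} \approx 0.8629
\]
\[
\bar{y} \pm t_{0.975,\,199}\,\mathrm{SE}(\bar{y})
= 59.45 \pm 1.972 \times 2.64875
\approx 59.45 \pm 5.223
= (54.227,\ 64.673)
\]
```

The resulting interval matches the 95% CL for Mean in the table to three decimal places.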


Estimating a sample proportion


Code


data new;
    set sample;
    bin=(MSZoning='RL');
run;


proc surveymeans data=new total=1459;
    var bin;
run;


Results


The SAS System

The SURVEYMEANS Procedure

Data Summary
Number of Observations 200

Statistics
Variable   N     Mean       Std Error of Mean   95% CL for Mean
bin        200   0.760000   0.028124            0.70454146   0.81545854
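
Because bin is a 0/1 indicator, its mean is the sample proportion of MSZoning='RL', and the standard error can be reproduced by hand from the finite-population variance formula (N-n)/N * s^2/n. The following is a worked sketch with N = 1459, n = 200 and the reported proportion 0.76; for a 0/1 variable the sample variance reduces to n/(n-1) * p(1-p).

```latex
\[
s^2 = \frac{n}{n-1}\,\hat{p}\,(1-\hat{p})
    = \frac{200}{199}\times 0.76 \times 0.24 \approx 0.18332
\]
\[
\mathrm{SE}(\hat{p}) = \sqrt{\frac{N-n}{N}\cdot\frac{s^2}{n}}
= \sqrt{\frac{1259}{1459}\times\frac{0.18332}{200}}
\approx \sqrt{0.000791} \approx 0.0281
\]
```

This agrees with the Std Error of Mean of 0.028124 printed by PROC SURVEYMEANS.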


Stratified sampling


* Draw a stratified sample using the variable MasVnrType as the stratum. The data must be sorted by that variable first. ;

proc sort data=population;
    by MasVnrType;
run;

proc freq data=population;
    tables MasVnrType;
run;



The SAS System

The FREQ Procedure

MasVnrType   Frequency   Percent   Cumulative Frequency   Cumulative Percent
BrkCmn       10          0.69      10                     0.69
BrkFace      434         29.75     444                    30.43
NA           16          1.10      460                    31.53
None         878         60.18     1338                   91.71
Stone        121         8.29      1459                   100.00


proc surveyselect data=population method=srs n=(5,5,5,5,5)
    out=sample2;
    strata MasVnrType;
run;


The SAS System

The SURVEYSELECT Procedure

Selection Method Simple Random Sampling
Strata Variable MasVnrType

Input Data Set POPULATION
Random Number Seed 132740001
Number of Strata 5
Total Sample Size 25
Output Data Set SAMPLE2
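
The stratified sample can be analyzed much like the simple random sample, provided the design is described to PROC SURVEYMEANS. The following is a sketch under stated assumptions, not something taken from the original post: the STATS option is added to the selection step so that the output carries a SamplingWeight variable, the stratum population sizes are built with PROC FREQ into a data set with a _TOTAL_ variable, and the data set names strata_totals and sample2w are hypothetical. MSSubClass is reused as the analysis variable purely for illustration.

```sas
/* Stratum population sizes: one row per MasVnrType, with the count
   renamed to _TOTAL_ as expected by the TOTAL= data set option. */
proc freq data=population noprint;
    tables MasVnrType / out=strata_totals(drop=percent rename=(count=_total_));
run;

/* Same design as above (population already sorted by MasVnrType);
   STATS adds SelectionProb and SamplingWeight to the output. */
proc surveyselect data=population method=srs n=(5,5,5,5,5)
    stats out=sample2w;
    strata MasVnrType;
run;

/* Design-based mean: strata, weights, and per-stratum totals for the
   finite population correction. */
proc surveymeans data=sample2w total=strata_totals;
    strata MasVnrType;
    weight SamplingWeight;
    var MSSubClass;
run;
```

With only 5 units per stratum the estimate will be imprecise; the point of the sketch is the STRATA, WEIGHT, and TOTAL= plumbing rather than the numbers it produces.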




