*********************************************** * BOOTSTRAPPING THE SAMPLE MEDIAN * *********************************************** * (C) Marta García-Granero 07/2008 * * send questions to: mgarciagranero@gmail.com * * Downloaded from: http://gjyp.nl/marta/ * * Feel free to use or modify this code, but * * acknowledge the author and the web site * *********************************************** * Sample dataset *. DATA LIST FREE/copper(F8.2). BEGIN DATA 0.70 0.45 0.72 0.30 1.16 0.69 0.83 0.74 1.24 0.77 0.65 0.76 0.42 0.94 0.36 0.98 0.64 0.90 0.63 0.55 0.78 0.10 0.52 0.42 0.58 0.62 1.12 0.86 0.74 1.04 0.65 0.66 0.81 0.48 0.85 0.75 0.73 0.50 0.34 0.88 END DATA. VARIABLE LABEL copper 'Urinary copper (µmol/24hr)'. DATASET NAME OriginalData. PRESERVE. * Initialize a random seed *. SET RNG=MT. SET MTINDEX=RANDOM. SET MXLOOPS=20000. /* Must be equal to k (see below) *. * Bootstrap CI for median *. MATRIX. PRINT /TITLE='**** BOOTSTRAPPING THE MEDIAN OF A SAMPLE ****'. GET data /VAR=ALL /NAMES=vname. COMPUTE n=NROW(data). COMPUTE k=20000. /* Number of bootsamples (change it if you want) *. COMPUTE bootmed= MAKE(k,1,0). COMPUTE bootsamp=MAKE(n,1,0). LOOP i=1 TO k. /* Extracting k bootstrap samples of size n *. - COMPUTE flipcoin=1+TRUNC(n*UNIFORM(n,1)). - COMPUTE bootsamp=data(flipcoin). * Compute boot sample median (sorting algorithm by R Ristow & J Peck) *. - COMPUTE sortboot=bootsamp. - COMPUTE sortboot(GRADE(bootsamp))=bootsamp. - COMPUTE pair=(TRUNC(n/2) EQ n/2). /* Check if n is odd (0) or even (1) *. - DO IF pair EQ 0. /* Median formula for odd samples. - COMPUTE bootmed(i)=sortboot((n+1)/2). - ELSE. /* Median formula for even samples *. - COMPUTE bootmed(i)=(sortboot(n/2)+sortboot(1+(n/2)))/2. - END IF. END LOOP. * Median & Mean of bootstrapped medians *. * Sorting algorithm by R Ristow & J Peck *. COMPUTE sortedbm=bootmed. COMPUTE sortedbm(GRADE(bootmed))=bootmed. COMPUTE median=(sortedbm(k*0.5)+sortedbm(1+k*0.5))/2. COMPUTE MedMean=CSUM(bootmed)/k. * Bootstrap estimator of the standard error of the median *. COMPUTE BootSEM=SQRT((CSSQ(bootmed)-k&*(MedMean&**2))/(k-1)). PRINT {MedMean,median,BootSEM} /FORMAT='F8.3' /CLABEL='Mean(Md)','Median','SDErr(*)' /RNAME=vname /TITLE='Bootstrapped Statistics'. PRINT/TITLE='(*) Std. Deviation of bootstrapped medians'. * NP confidence interval *. COMPUTE lower1=sortedbm(k*0.025). COMPUTE upper1=sortedbm(1+k*0.975). * Parametric confidence interval *. LOOP tvalue95=1960 to 12706. - COMPUTE t95=tvalue95/1000. - COMPUTE onetail=1-tcdf(t95,(n-1)). - DO IF ABS(onetail-0.025) lt 0.00005. - BREAK. - END IF. END LOOP. COMPUTE lower2=MedMean-t95*BootSEM. COMPUTE upper2=MedMean+t95*BootSEM. PRINT {lower1,upper1;lower2,upper2} /FORMAT='F8.3' /CLABEL='Lower CL','Upper CL' /RLABEL='NonPar','Par' /TITLE='95%CI: Non parametric (percentiles 2.5 & 97.5) & Parametric (t based)'. SAVE bootmed /OUTFILE='C:\Temp\BootStrappedMedians.sav' /NAMES=vname. PRINT k /FORMAT='F8' /RLABEL='K=' /TITLE='K bootsampled medians saved to C:\Temp\BootStrappedMedians.sav'. END MATRIX. GET FILE ='C:\Temp\BootStrappedMedians.sav' . DATASET NAME BootstrappedMedians. APPLY DICTIONARY /FROM OriginalData /SOURCE VARIABLES=ALL /FILEINFO /VARINFO VARLABEL. FREQUENCIES VARIABLES=ALL /FORMAT=NOTABLE /HISTOGRAM NORMAL /STATISTICS=SKEWNESS KURTOSIS. RESTORE.