I want to sort data in SAS data step. What exactly I mean is: the work of proc sort should be done in data step. Is there any solution?
If you're looking for a data-step only solution, you can do the work of PROC SORT
with a hash table. The caveat is that you need enough memory to do it.
If you want to do a simple sort, you would load the hash table with the ordered:'yes'
option and output it to a new table. By default, ordered:yes
will sort the data in ascending order. You can specify descending
as well.
Simple Sort
data _null_;
/* Sets up PDV without loading the table */
if(0) then set sashelp.class;
/* Load sashelp.class into memory ordered by Height. Do not remove duplicates. */
dcl hash sortit(dataset:'sashelp.class', ordered:'yes', multidata:'yes');
sortit.defineKey('Height'); * Order by height;
sortit.defineData(all:'yes'); * Keep all variables in the output dataset;
sortit.defineDone();
/* Output to a dataset called class_sorted */
sortit.Output(dataset:'class_sorted');
run;
De-duping
To remove duplicates, do the exact same operation, except remove the multidata
option. In the below table, observations (8, 9) and (15, 16) are duplicates of each other. Observations 9 and 16 will be eliminated.
data _null_;
/* Sets up PDV without loading the table */
if(0) then set sashelp.class;
/* Load sashelp.class into memory ordered by Height. Do not keep duplicates. */
dcl hash sortit(dataset:'sashelp.class', ordered:'yes');
sortit.defineKey('Height'); * Order by height;
sortit.defineData(all:'yes'); * Keep all variables in the output dataset;
sortit.defineDone();
/* Output to a dataset called class_sorted */
sortit.Output(dataset:'class_sorted');
run;