I am working in Python with data retrieved from a SAS server. I am currently using the SASPy to_df() function to bring it from SAS into a local pandas DataFrame.
I would like to know if it's possible to filter/query the data being transferred, so I can avoid bringing over unneeded data and speed up my download.
I couldn't find anything in the saspy documentation. It only offers the possibility of using "**kwargs", but I couldn't figure out how to do that.
Thanks.
You need to define the SASdata object with the WHERE= dataset option (passed through the dsopts argument of sasdata()) to limit the observations pulled.
https://sassoftware.github.io/saspy/api.html#saspy.sasdata.SASdata
Then when you use the to_df() method only the selected data will be transferred.
You can also use the KEEP= or DROP= dataset option to limit the variables that are transferred. Remember that any variable you reference in the WHERE= option has to be among the variables that are kept.
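As a rough sketch of what that could look like in Python: the helper names build_dsopts and fetch_filtered are made up for illustration, and the table, libref, and filter in the commented usage (sashelp.cars, msrp > 40000) are assumptions, not taken from your setup. It relies on the dsopts argument of sasdata() described on the linked API page, with 'where' and 'keep' keys.

```python
def build_dsopts(where=None, keep=None):
    """Assemble a dsopts dict for saspy's sasdata(); unset options are left out.

    Hypothetical helper, not part of saspy itself.
    """
    opts = {}
    if where:
        # SAS WHERE expression, evaluated on the server before any transfer
        opts["where"] = where
    if keep:
        # Space-separated variable names, like a KEEP= dataset option
        opts["keep"] = " ".join(keep)
    return opts


def fetch_filtered(sas, table, libref, where=None, keep=None):
    """Return only the selected rows/columns of libref.table as a DataFrame.

    `sas` is assumed to be an already connected saspy.SASsession.
    """
    sd = sas.sasdata(table, libref=libref, dsopts=build_dsopts(where, keep))
    return sd.to_df()


# Example usage (needs a working saspy configuration, so left commented out):
# import saspy
# sas = saspy.SASsession()
# df = fetch_filtered(sas, "cars", "sashelp",
#                     where="msrp > 40000",
#                     keep=["make", "model", "msrp"])
```

Note that msrp is in the keep list because the WHERE expression refers to it.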
The "**kwargs" looks to be about changing how you connect to the SAS server, so that is not important for what you want.