I have to answer two questions:
groupby().count can't do the thing. Also, some users may have chosen MySQL and some other database, so they cannot be counted toward the 2nd question.
What should I do? I tried many solutions, but they all led me to nothing. I came up with this:
import re
import pandas as pd

# conn is assumed to be an open database connection created earlier (not shown)
query = 'SELECT * FROM DatabaseWorkedWith'
df = pd.read_sql_query(query, conn)
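inde_list = list()  # respondents already seen on a matching row
for index in df.index:
    # 'SQL{1}' is equivalent to the plain pattern 'SQL'
    if re.search('SQL{1}', df.loc[index, 'DatabaseWorkedWith']):
        respondent = df.loc[index, 'Respondent']
        if respondent not in inde_list:
            inde_list.append(respondent)
        else:
            # drop repeated rows for a respondent that already matched
            df.drop(index, inplace=True)
del inde_list
df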
(For some reason I cannot prettify the format of this code.) There must be a better way, and this still deals with only half of the problem.
How I would approach the problem (there might be better ways):
Match the DatabaseWorkedWith column as follows, then drop all rows with False, and also drop all duplicates in the Respondent column to get all the unique users.
Select from the DatabaseWorkedWith column based on where it matched MySQL; the corresponding value in the Respondent column is your answer.
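
The steps above could be sketched roughly like this. It is only an illustration under a few assumptions that the post does not confirm: that the table has one row per (Respondent, database) pair, that the database is spelled exactly "MySQL", and that the two questions are "who worked with MySQL" and "who worked with MySQL and nothing else". The sample data and the names is_mysql, mysql_users, db_counts and mysql_only are made up for the example.

```python
import pandas as pd

# Hypothetical sample data, assuming one row per (Respondent, database) pair.
df = pd.DataFrame({
    "Respondent": [1, 1, 2, 3, 3, 4],
    "DatabaseWorkedWith": ["MySQL", "PostgreSQL", "MySQL",
                           "SQLite", "MongoDB", "MySQL"],
})

# Boolean mask: True on rows whose database is exactly "MySQL".
is_mysql = df["DatabaseWorkedWith"].eq("MySQL")

# Keep only the True rows and drop duplicate respondents:
# these are the unique users who worked with MySQL at all.
mysql_users = df.loc[is_mysql, "Respondent"].drop_duplicates()

# Number of distinct databases each respondent listed.
db_counts = df.groupby("Respondent")["DatabaseWorkedWith"].nunique()

# MySQL users who listed exactly one database, i.e. MySQL only.
mysql_only = mysql_users[mysql_users.map(db_counts).eq(1)]

print(len(mysql_users), len(mysql_only))  # prints: 3 2
```

If the column instead holds several databases in one string per respondent, the exact comparison would not work as written and something like str.contains or str.split would be needed first.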