database parsing pubchem

Database of chemicals with associated properties?


I think PubChem has what I need. For a school project, I want a database that is (or could be converted into) a table mapping each chemical identifier to a series of properties. The issue is that PubChem is too large. The only format they offer that I know how to decode is XML (they also offer SDF and ASN; here's the link: ftp://ftp.ncbi.nlm.nih.gov/pubchem/Substance/CURRENT-Full/), and I don't have enough RAM to open the XML files in a text editor.

Is there an alternative database I can use?

Is there a way to slice up the XML files into more manageable pieces before loading them?

Once I have the data in any form I can open, I will be able to parse it with code, so the sheer volume of data is not a problem in itself.


Solution

  • I think I'm starting to figure out how to do this. "XML stream" was a good keyword to search with. Sorry my phrasing in the question was a little strange; I wasn't really sure what I was asking from a technical standpoint.
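
For anyone landing here later, this is roughly what the streaming approach can look like in Python. It is only a sketch: `iter_records`, `local_name`, and the `substance`/`id`/`name` tags in the sample are names I made up for illustration, not the real PubChem element names, so check an actual file for the tag that repeats once per record. The only library call it relies on is the standard library's `xml.etree.ElementTree.iterparse`, which reads the file piece by piece instead of loading it all into RAM.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def iter_records(source, record_tag):
    """Yield one dict per <record_tag> element without loading the whole file.

    source: a filename or a binary file object.
    record_tag: local name of the repeating element (hypothetical here).
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == record_tag:
            # Flatten only the direct children into tag -> text pairs.
            yield {local_name(child.tag): (child.text or "").strip()
                   for child in elem}
            root.clear()  # drop finished records so memory use stays flat


# Tiny made-up document standing in for a multi-gigabyte file.
sample = b"""<substances>
  <substance><id>1</id><name>water</name></substance>
  <substance><id>2</id><name>ethanol</name></substance>
</substances>"""

rows = list(iter_records(io.BytesIO(sample), "substance"))
for row in rows:
    print(row)
```

Because the function is a generator, each record can be written straight to a CSV or a database row as it arrives, so the full table never has to sit in memory. If the downloaded files turn out to be gzip-compressed, passing the file object from `gzip.open(path, "rb")` as `source` should work the same way. Real records are likely nested more deeply than this sample, so the dict comprehension would need adapting to pull out the properties you care about.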