I'm converting from XML to CSV with xml2 | 2csv
tool, then call .import
to SQLite. I don't want to generate a CSV file, but want to pipe the converted file from xml2
to the .import
query.
I have tags.xml:
<tags>
<row Id="1" TagName=".net" Count="316293" ExcerptPostId="3624959" WikiPostId="3607476" />
<row Id="2" TagName="html" Count="1116853" ExcerptPostId="3673183" WikiPostId="3673182" />
<row Id="3" TagName="javascript" Count="2343663" ExcerptPostId="3624960" WikiPostId="3607052" />
</tags>
So instead of doing this conversion:
xml2 | 2csv row @Id @TagName @Count @ExcerptPostId @WikiPostId < tags.xml > tags.csv
Then importing to database from command line:
sqlite3 mydb.db -csv ".import tags.csv tags"
I want something like this from the command line:
sqlite3 mydb.db -csv ".import <xml2 | 2csv command> tags"
My XML file is around 100GB so I don't want to materialize it.
I do the same with a 90 GB file, here is how I import it:
xml2 < Posts.xml | 2csv row @Id @PostTypeId @AcceptedAnswerId @CreationDate \
@Score @ViewCount @Body @OwnerUserId @LastEditorUserId \
@LastEditorDisplayName @LastEditDate @LastActivityDate \
@Title @Tags @AnswerCount @CommentCount @FavoriteCount \
@CommunityOwnedDate @ContentLicense | \
sqlite3 stack.db -csv ".import /dev/stdin posts"
The conversion with xml2
happens via pipes, and the stdout of 2csv
is passed directly into the stdin of sqlite3