I have been practicing Django for a while. I am currently using it in a project where I fetch Facebook data via GET requests and save it to an SQLite database using Django models. I would like to know how I can improve the following code so that it saves a list of Facebook posts and their metrics efficiently. At the moment, I use a for loop to iterate over a list of Facebook posts and their metrics; each post is mapped to the Django model and then saved.
def save_post(post_id, page_id):
    # Fetch the post and its metrics from the Graph API.
    facebook_post = Post(post_id=post_id,
                         access_token=fb_access_token)
    # Use the post_id argument; the name `post` is not defined in this scope.
    post_db = PostsModel(page_id=page_id, post_id=post_id)
    post_db.message = facebook_post.message
    post_db.story = facebook_post.story
    post_db.full_picture = facebook_post.full_picture
    post_db.reactions_count = facebook_post.reactions_count
    post_db.comments_count = facebook_post.comments_count
    post_db.shares_count = facebook_post.shares_count
    post_db.interactions_count = facebook_post.interactions_count
    post_db.created_time = facebook_post.created_time
    post_db.published = facebook_post.published
    post_db.attachment_title = facebook_post.attachment_title
    post_db.attachment_description = facebook_post.attachment_description
    post_db.attachment_target_url = facebook_post.attachment_target_url
    post_db.save()
post_db is a Django model object instantiated using PostsModel, while Post is a normal Python class which I wrote. The latter is simply a collection of GET requests that fetch data from Facebook's Graph API and return JSON, from which I assign the relevant data to class attributes (for example message and shares_count).
I read about the bulk_create function in Django's documentation, but I don't know how to apply it to the code above. I also tried using multiprocessing and Pool, but the above function does not execute. Right now, I am just iterating sequentially over the list, so saving takes longer as the list grows.
def create(self, request):
    page_id = request.data['page_id']
    page = get_object_or_404(PagesModel, pk=page_id)
    post_list = get_list_or_404(PostsModel, page_id=page_id)
    # post_list holds PostsModel instances, so pass each one's post_id.
    for post in post_list:
        save_post(post_id=post.post_id, page_id=page)
The above function gets the already-saved list of posts for a specific page from the database, based on the page_id. The for loop then iterates over each post in the list, and its post_id and the page instance are sent to the save_post function, which fetches its data and saves it.
Huge thanks if anyone can suggest a more effective way to tackle this. Thank you.
You are going in the right direction with bulk_create. Generate a list of PostsModel objects and then use bulk_create to insert them into the database. An important note here is that it won't work if the posts already exist in the database. For updating existing posts, try bulk_update.
def save_post(post_id, page_id):
    # Fetch the post and its metrics from the Graph API.
    facebook_post = Post(post_id=post_id,
                         access_token=fb_access_token)
    # Use the post_id argument; the name `post` is not defined in this scope.
    post_db = PostsModel(page_id=page_id, post_id=post_id)
    post_db.message = facebook_post.message
    post_db.story = facebook_post.story
    post_db.full_picture = facebook_post.full_picture
    post_db.reactions_count = facebook_post.reactions_count
    post_db.comments_count = facebook_post.comments_count
    post_db.shares_count = facebook_post.shares_count
    post_db.interactions_count = facebook_post.interactions_count
    post_db.created_time = facebook_post.created_time
    post_db.published = facebook_post.published
    post_db.attachment_title = facebook_post.attachment_title
    post_db.attachment_description = facebook_post.attachment_description
    post_db.attachment_target_url = facebook_post.attachment_target_url
    # Return the unsaved instance so the caller can save all of them at once.
    return post_db
def create(self, request):
    page_id = request.data['page_id']
    page = get_object_or_404(PagesModel, pk=page_id)
    post_list = get_list_or_404(PostsModel, page_id=page_id)
    # Build the unsaved model instances first, then write them in one call.
    post_model_list = [save_post(post_id=post.post_id, page_id=page)
                       for post in post_list]
    PostsModel.objects.bulk_create(post_model_list)
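Since bulk_create fails for posts that are already in the database, one option is to split the fetched posts into "new" and "already saved" before writing. The following is only a rough sketch of that split in plain Python: the helper name split_new_and_existing, the dictionary layout, and the sample IDs are all made up for illustration and are not part of your code or of Django.

```python
from typing import Dict, Iterable, List, Set, Tuple

PostRow = Dict[str, object]


def split_new_and_existing(
    fetched: Iterable[PostRow],
    existing_ids: Set[str],
) -> Tuple[List[PostRow], List[PostRow]]:
    """Split fetched posts into rows to insert and rows to update.

    Hypothetical helper: `fetched` is assumed to be an iterable of dicts
    that each carry a "post_id" key.
    """
    to_create: List[PostRow] = []
    to_update: List[PostRow] = []
    seen: Set[object] = set()
    for post in fetched:
        post_id = post["post_id"]
        if post_id in seen:
            continue  # skip duplicates within the same batch
        seen.add(post_id)
        if post_id in existing_ids:
            to_update.append(post)
        else:
            to_create.append(post)
    return to_create, to_update


# Made-up sample data standing in for Graph API results.
fetched_posts = [
    {"post_id": "1_10", "shares_count": 3},
    {"post_id": "1_11", "shares_count": 0},
    {"post_id": "1_10", "shares_count": 3},  # duplicate, ignored
    {"post_id": "1_12", "shares_count": 7},
]
existing = {"1_11"}  # post_ids assumed to be in the database already

new_rows, changed_rows = split_new_and_existing(fetched_posts, existing)
print([p["post_id"] for p in new_rows])      # ['1_10', '1_12']
print([p["post_id"] for p in changed_rows])  # ['1_11']
```

In the view, the set of existing IDs could come from something like PostsModel.objects.filter(page_id=page_id).values_list("post_id", flat=True). The new rows would then go to bulk_create, while bulk_update expects already-saved model instances plus the list of field names to write, so the changed rows would first have to be copied onto the instances loaded from the database.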