I have a users table with 24K users on my Django site, and I need to retrieve information for each of them by sending a request to a remote API endpoint that is rate limited (15 requests/minute).
My plan is to use Celery periodic tasks together with a new model called "Job". As I see it, there are two ways to do this (both sketched below):
1. For each user, create a new Job instance with a ForeignKey relation to that user.
2. Keep a single Job instance that has a "users" ManyToManyField pointing at all the users.
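Roughly, the two model designs would look like this (the class names and the processed flag are just placeholders for illustration):

    from django.conf import settings
    from django.db import models

    # Option 1: one Job row per user
    class PerUserJob(models.Model):
        user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
        processed = models.BooleanField(default=False)

    # Option 2: a single Job row that collects every user to process
    class BulkJob(models.Model):
        users = models.ManyToManyField(settings.AUTH_USER_MODEL)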
Then I'll process the Job instance(s) with Celery; for example, with the first option I could process one Job instance on each run of the periodic task. But that means creating a huge number of DB objects for every bulk request series...
Both options seem heavyweight to me since they put a big load on the database. Am I wrong? I suspect there should be a more convenient way of doing this. Can you suggest a better approach, or are my approaches good enough to implement?
You could add a last_updated field to your user model, then set up a task that runs every minute and selects the 15 users whose data was updated least recently:
    from django.contrib.auth.models import AbstractUser
    from django.db import models
    from django.utils import timezone

    class User(AbstractUser):
        last_updated = models.DateTimeField(default=timezone.now, db_index=True)

    def task():
        # the 15 users whose data is the most stale
        users = User.objects.order_by('last_updated')[:15]
        for user in users:
            # perform the API call here, then mark the user as refreshed
            user.last_updated = timezone.now()
            user.save(update_fields=['last_updated'])
This way you would not have to set up a complicated job queue/table.
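To wire this up as a Celery periodic task, a minimal beat schedule entry could look like the sketch below, assuming app is your Celery application instance and the function above is registered with @shared_task under the (hypothetical) path myapp.tasks.task:

    # celery.py -- a minimal sketch; the module path 'myapp.tasks.task' is an assumption
    app.conf.beat_schedule = {
        'refresh-stale-users': {
            'task': 'myapp.tasks.task',
            'schedule': 60.0,  # once a minute; 15 users per run keeps you within the 15 requests/minute limit
        },
    }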