I have a large import task I need to do with core data.
Let say my core data model look like this:
Car
----
identifier
type
I fetch a list of car info JSON from my server and then I want to sync it with my core data Car
object, meaning:
If its a new car -> create a new Core Data Car
object from the new info.
If the car already exists -> update the Core Data Car
object.
So I want to do this import in background without blocking the UI and while the use scrolls a cars table view that present all the cars.
Currently I'm doing something like this:
// create background context
NSManagedObjectContext *bgContext = [[NSManagedObjectContext alloc]initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[bgContext setParentContext:self.mainContext];
[bgContext performBlock:^{
NSArray *newCarsInfo = [self fetchNewCarInfoFromServer];
// import the new data to Core Data...
// I'm trying to do an efficient import here,
// with few fetches as I can, and in batches
for (... num of batches ...) {
// do batch import...
// save bg context in the end of each batch
[bgContext save:&error];
}
// when all import batches are over I call save on the main context
// save
NSError *error = nil;
[self.mainContext save:&error];
}];
But I'm not really sure I'm doing the right thing here, for example:
Is it ok that I use setParentContext
?
I saw some examples that use it like this, but I saw other examples that don't call setParentContext
, instead they do something like this:
NSManagedObjectContext *bgContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
bgContext.persistentStoreCoordinator = self.mainContext.persistentStoreCoordinator;
bgContext.undoManager = nil;
Another thing that I'm not sure is when to call save on the main context, In my example I just call save in the end of the import, but I saw examples that uses:
[[NSNotificationCenter defaultCenter] addObserverForName:NSManagedObjectContextDidSaveNotification object:nil queue:nil usingBlock:^(NSNotification* note) {
NSManagedObjectContext *moc = self.managedObjectContext;
if (note.object != moc) {
[moc performBlock:^(){
[moc mergeChangesFromContextDidSaveNotification:note];
}];
}
}];
As I mention before, I want the user to be able to interact with the data while updating, so what if I the user change a car type while the import change the same car, is the way I wrote it safe?
This is an extremely confusing topic for people approaching Core Data for the first time. I don't say this lightly, but with experience, I am confident in saying the Apple documentation is somewhat misleading on this matter (it is in fact consistent if you read it very carefully, but they don't adequately illustrate why merging data remains in many instances a better solution than relying on parent/child contexts and simply saving from a child to the parent).
The documentation gives the strong impression parent/child contexts are the new preferred way to do background processing. However Apple neglect to highlight some strong caveats. Firstly, be aware that everything you fetch into your child context is first pulled through it's parent. Therefore it is best to limit any child of the main context running on the main thread to processing (editing) data that has already been presented in the UI on the main thread. If you use it for general synchronisation tasks it is likely you will be wanting to process data which extends far beyond the bounds of what you are currently displaying in the UI. Even if you use NSPrivateQueueConcurrencyType, for the child edit context, you will potentially be dragging a large amount of data through the main context and that can lead to bad performance and blocking. Now it is best not to make the main context a child of the context you use for synchronisation, because it won't be notified of synchronisation updates unless you are going to do that manually, plus you will be executing potentially long running tasks on a context you might need to be responsive to saves initiated as a cascade from the edit context that is a child of your main context, through the main contact and down to the data store. You will have to either manually merge the data and also possibly track what needs to be invalidated in the main context and re-sync. Not the easiest pattern.
What the Apple documentation does not make clear is that you are most likely to need a hybrid of the techniques described on the pages describing the "old" thread confinement way of doing things, and the new Parent-Child contexts way of doing things.
Your best bet is probably (and I'm giving a generic solution here, the best solution may be dependent on your detailed requirements), to have a NSPrivateQueueConcurrencyType save context as the topmost parent, which saves directly to the datastore. [Edit: you won't be doing very much directly on this context], then give that save context at least two direct children. One your NSMainQueueConcurrencyType main context you use for the UI [Edit: it's best to be disciplined and avoid ever doing any editing of the data on this context], the other a NSPrivateQueueConcurrencyType, you use to do user edits of the data and also (in option A in the attached diagram) your synchronisation tasks.
Then you make the main context the target of the NSManagedObjectContextDidSave notification generated by the sync context, and send the notifications .userInfo dictionary to the main context's mergeChangesFromContextDidSaveNotification:.
The next question to consider is where you put the user edit context (the context where edits made by the user get reflected back into the interface). If the user's actions are always confined to edits on small amounts of presented data, then making this a child of the main context again using the NSPrivateQueueConcurrencyType is your best bet and easiest to manage (save will then save edits directly into the main context and if you have an NSFetchedResultsController, the appropriate delegate method will be called automatically so your UI can process the updates controller:didChangeObject:atIndexPath:forChangeType:newIndexPath:) (again this is option A).
If on the other hand user actions might result in large amounts of data being processed, you might want to consider making it another peer of the main context and the sync context, such that the save context has three direct children. main, sync (private queue type) and edit (private queue type). I've shown this arrangement as option B on the diagram.
Similarly to the sync context you will need to [Edit: configure the main context to receive notifications] when data is saved (or if you need more granularity, when data is updated) and take action to merge the data in (typically using mergeChangesFromContextDidSaveNotification:). Note that with this arrangement, there is no need for the main context to ever call the save: method.
To understand parent/child relationships, take Option A: The parent child approach simply means if the edit context fetches NSManagedObjects, they will be "copied into" (registered with) first the save context, then the main context, then finally edit context. You will be able to make changes to them, then when you call save: on the edit context, the changes will saved just to the main context. You would have to call save: on the main context and then call save: on the save context before they will be written out to disk.
When you save from a child, up to a parent, the various NSManagedObject change and save notifications are fired. So for example if you are using a fetch results controller to manage your data for your UI, then it's delegate methods will be called so you can update the UI as appropriate.
Some consequences: If you fetch object and NSManagedObject A on the edit context, then modify it, and save, so the modifications are returned to the main context. You now have the modified object registered against both the main and the edit context. It would be bad style to do so, but you could now modify the object again on the main context and it will now be different from the object as it is stored in the edit context. If you then try to make further modifications to the object as stored in the edit context, your modifications will be out of sync with the object on the main context, and any attempt to save the edit context will raise an error.
For this reason, with an arrangement like option A, it is a good pattern to try to fetch objects, modify them, save them and reset the edit context (e.g. [editContext reset] with any single iteration of the run-loop (or within any given block passed to [editContext performBlock:]). It is also best to be disciplined and avoid ever doing any edits on the main context. Also, to re-iterate, since all processing on main is the main thread, if you fetch lots of objects to the edit context, the main context will be doing it's fetch processing on the main thread as those objects are being copied down iteratively from parent to child contexts. If there is a lot of data being processed, this can cause unresponsiveness in the UI. So if, for example you have a large store of managed objects, and you have a UI option that would result in them all being edited. It would be a bad idea in this case to configure your App like option A. In such a case option B is a better bet.
If you aren't processing thousands of objects, then option A may be entirely sufficient.
BTW don't worry too much over which option you select. It might be a good idea to start with A and if you need to change to B. It's easier than you might think to make such a change and usually has fewer consequences than you might expect.