In my project, I load a lot of data from JSON files. To make this faster, one thread loads the files' content into a std::deque while another thread formats the data from that std::deque.
The function that loads the data into the std::deque uses a std::unique_ptr to access the deque. The problem is that changes made through the unique_ptr are not applied to the original std::deque.
That's my first question: does a unique_ptr (or shared_ptr) modify the object that was passed to std::make_unique() (or std::make_shared()) to initialize it?
I solved this by using a reference, but another question remains: does a unique_ptr depend on the original object staying alive, like a raw pointer does?
I ask because, at another point in my project, I use the loaded data by passing a unique pointer to that data to another function. Do I need to keep the data itself? Or is it enough to store the unique_ptr, even when the data used to initialize it is destroyed?
Here's the code I used for loading the JSONs:
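#include <array>
#include <deque>
#include <fstream>
#include <iostream>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>
#include <nlohmann/json.hpp>
using std::array;
using std::vector;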
typedef std::deque<std::pair<array<array<double, 28>, 28>, unsigned int>> threadLoadList;
typedef vector<std::pair < std::shared_ptr<vector<double>>, std::shared_ptr<vector<double>> >> dataPointer;
namespace data {
std::mutex mtx{};
vector<vector<double>> training_desired_output{};
vector<vector<double>> training_images{};
dataPointer pic_val_pairs{};
}
void loadTrainImages(std::unique_ptr<threadLoadList> storage) {
for(int i = 0; i < 60000; ++i) {
std::ifstream str(data::training_data_path + "\\image" + Format_number(i) + ".json");
nlohmann::json data = nlohmann::json::parse(str);
data::mtx.lock();
storage->push_back(std::make_pair<array<array<double, 28>, 28>, unsigned int>(
data["pic"],
data["val"]
));
data::mtx.unlock();
str.close();
}
}
int main() {
threadLoadList pics_vals{};
auto load = std::thread(loadTrainImages, std::make_unique<threadLoadList>(pics_vals));
while(true) {
data::mtx.lock();
unsigned int size = pics_vals.size();
data::mtx.unlock();
if(size > 0) {
std::cout << "got one\n";
data::mtx.lock();
std::pair<array<array<double, 28>, 28>, unsigned int> copy = pics_vals.front();
pics_vals.pop_front();
data::mtx.unlock();
//format 28 x 28 array
vector<double> pic{};
for(int i = 27; i > -1; --i) {
for(int n = 0; n < 28; ++n) {
pic.push_back(copy.first[i][n]);
}
}
// format val-array
vector<double> val(10);
val = { 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 };
val[copy.second] = 1.0;
// Do I need to store the data (pic and val) that pic_val_pairs points to globally? Or is it enough to store the pointers and let pic and val be destroyed?
data::training_images.push_back(pic);
data::training_desired_output.push_back(val);
data::pic_val_pairs.push_back(
std::make_pair<std::shared_ptr<vector<double>>, std::shared_ptr<vector<double>>>(
std::make_shared<vector<double>>(data::training_images.back()),
std::make_shared<vector<double>>(data::training_desired_output.back())
));
}
if(data::pic_val_pairs.size() >= 60000) {
break;
}
}
// Do something
}
I expected unique_ptr to behave like a raw pointer.
Running the code above showed me that pics_vals doesn't change when loadTrainImages modifies the deque through the pointer created from pics_vals.
I suspect that unique_ptr copies the values into a new object, because the process uses 800 MB of memory after loading, although the JSON files total only 180 MB. Is that right?
For loading the data, it was easily solved by using references, but in the part marked // Do something I can't use references because I need to randomly shuffle the pic_val_pairs.
"I expected unique_ptr to behave like a raw pointer."
That was the mistake. If you want the behavior of a raw pointer, use a raw pointer.
You can fix this code by passing a raw pointer to pics_vals instead of a std::unique_ptr.
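As a rough sketch of the raw-pointer approach, the loader thread can be handed the address of the caller's deque. The names LoadList, list_mutex, produce and load_into_callers_deque are hypothetical, and a plain int stands in for the question's image/label pair to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the question's threadLoadList: an int
// replaces the 28 x 28 image/label pair.
using LoadList = std::deque<int>;

std::mutex list_mutex;

// The producer receives a non-owning raw pointer, so every push_back
// lands in the caller's deque rather than in a copy.
void produce(LoadList* storage, int count) {
    for (int i = 0; i < count; ++i) {
        std::lock_guard<std::mutex> lock(list_mutex);
        storage->push_back(i);
    }
}

// Starts the producer on a local deque, waits for it to finish,
// and returns how many elements arrived in that deque.
std::size_t load_into_callers_deque(int count) {
    LoadList items;                              // owned by this function
    std::thread loader(produce, &items, count);  // pass the address, not a copy
    loader.join();                               // items must outlive the thread
    std::lock_guard<std::mutex> lock(list_mutex);
    return items.size();
}
```

Calling load_into_callers_deque(100) should report 100 elements, since the thread writes into the same deque the caller reads. The deque has to stay alive until the thread has been joined, because the raw pointer does not own it.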
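A raw pointer can point to your existing pics_vals object, whereas std::make_unique<threadLoadList>(pics_vals) allocates a new deque that is copy-constructed from pics_vals and returns a pointer that owns that copy. This also means the copy does not depend on pics_vals staying alive.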
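The difference can be checked with a small, self-contained sketch. The function names below are made up for illustration, and a std::deque<int> stands in for the question's deque of image/label pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>

// Pushes through a pointer obtained from std::make_unique and returns
// the size of the ORIGINAL deque afterwards.
std::size_t original_size_after_make_unique_push() {
    std::deque<int> original{1, 2, 3};
    auto owned = std::make_unique<std::deque<int>>(original); // copy-constructs a new deque
    owned->push_back(4);     // modifies the copy only
    return original.size();  // the original is untouched
}

// Pushes through a raw pointer to the original deque and returns
// the size of the original afterwards.
std::size_t original_size_after_raw_pointer_push() {
    std::deque<int> original{1, 2, 3};
    std::deque<int>* alias = &original; // points at the existing object
    alias->push_back(4);                // modifies the original
    return original.size();
}

// The object created by std::make_shared is owned by the shared_ptr,
// so it stays valid after the source object has been destroyed.
std::size_t copy_outlives_source() {
    std::shared_ptr<std::deque<int>> kept;
    {
        std::deque<int> temporary{1, 2, 3};
        kept = std::make_shared<std::deque<int>>(temporary); // independent copy
    } // temporary is destroyed here
    return kept->size();
}
```

The first function returns 3 and the second returns 4, which matches the behavior seen in the question. The third returns 3: the smart pointer keeps its own copy alive, so storing only the pointers is enough, at the cost of holding the data twice while the source is also kept.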