c++ oop design-patterns application-design

I need help designing my C++ console application

I have a task to complete.

There are 4000+ CSV files of two types, and the two types are related to each other.
The two types are:
1. Country2.csv
2. Security_Name.csv

Contents of Country2.csv:
Company Name;Security Name;;;;Final NOS;Final FFR

Contents of Security_Name.csv:
Date;Close Price;Volume

There are multiple countries, and for each country there are multiple security files.

Now I need to READ them, do some CALCULATION, and then WRITE the output to other files.

READ

Read both files (Country 2.csv and Security.csv) and extract all the data from them.

For example :

Read France 2.csv, extract Security_Name, Final NOS, Final FFR

Then read Security.csv (the file that matches the Security_Name) and extract Date, Close Price, Volume.

CALCULATION

The calculations basically consist of finding the median of the extracted values, which is quite simple.

For Example:

Monthly Median Traded Values
Daily Traded Value of a Security
... and so on

WRITE

Based on the month, I need to sort the output into two different files with the following formats:

If Month % 3 = 0

Save it as MONTH_NAME.csv in the following format:
Security name; 12-month indicator; 3-month indicator; FOT

Else

Save it as MONTH_NAME.csv in the following format:
Security Name; Monthly Median Traded Value Ratio; Number of days Volume > 0

My question is how do I design my application in such a way that it is maintainable and the flow of data throughout the execution is seamless?

Solution

First things first: based on the kind of data you are looking to generate, I would look at moving this data into a SQL database if at all possible. This is "one SQL query" kind of stuff, and far more maintainable than C++ that generates CSV files from CSV files.

Barring that, I would probably look at using GNU datamash and/or Perl. On a Windows platform, you could do this through Cygwin or WSL. That is probably less maintainable, but it is so much easier to write that maintainability is not much of an issue.

That said, if you're looking for something moderately maintainable, C++ could work. The first thing I would do is design my input classes. That is a data-centric approach, but it can work. It sounds like you could have a Country class, a Security class, and a SecurityClose class, or something along those lines. You can think about whether a Security should contain a collection of SecurityClose objects (its data), or whether the data should just be "loose" and reference the Security it belongs to. The same goes for the Country->Security relationship.
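One possible shape for those input classes, as a minimal sketch rather than a finished design. It takes the "contained" option, where a Security owns its SecurityClose rows and a Country owns its securities. The member names and the choice of double for the numeric columns are assumptions; the question only gives the column headers.

```cpp
#include <string>
#include <vector>

// One row of a security file: Date;Close Price;Volume
struct SecurityClose {
    std::string date;        // kept as text; the date format is not stated in the question
    double closePrice = 0.0;
    double volume = 0.0;
};

// One row of a country file, plus the price history loaded for that security
struct Security {
    std::string companyName;
    std::string name;        // the "Security Name" column; also identifies the security file
    double finalNos = 0.0;   // "Final NOS" column
    double finalFfr = 0.0;   // "Final FFR" column
    std::vector<SecurityClose> closes;  // "contained" option: the security owns its rows
};

// One country file
struct Country {
    std::string name;
    std::vector<Security> securities;
};
```

With the "loose" alternative, SecurityClose would instead carry the name of the security it belongs to, and all rows would live in one flat collection.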

Once you've decided how all that's going to look, you want something (likely a function) that can tokenize a CSV line, so that "1,2,3" gets turned into a vector<string> with the contents "1" "2" "3". Your files use ';' as the delimiter, so make the delimiter a parameter. Then, each of your input classes should have a constructor or initializer that takes a vector<string> and populates itself. You might need to pass higher-level data along too, like the filename if you want the security data to know which security it belongs to.
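A minimal sketch of that tokenizer and of one class that populates itself from the tokens. The names splitLine and SecurityClose are illustrative, not from any library. The sketch assumes fields are never quoted, never contain the delimiter, and that numbers use '.' as the decimal separator; if your files break any of those assumptions, the parsing needs more care.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Split one CSV line on a single-character delimiter.
// Empty fields are kept, so "a;;b" yields {"a", "", "b"}.
// Assumption: no quoting and no escaped delimiters.
std::vector<std::string> splitLine(const std::string& line, char delim = ';') {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = line.find(delim, start);
        if (pos == std::string::npos) {
            fields.push_back(line.substr(start));  // last field on the line
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}

// Hypothetical record for one row of a security file (Date;Close Price;Volume).
struct SecurityClose {
    std::string securityName;  // higher-level data supplied by the caller
    std::string date;
    double closePrice = 0.0;
    double volume = 0.0;

    SecurityClose(const std::string& security,
                  const std::vector<std::string>& fields)
        : securityName(security) {
        if (fields.size() < 3) {
            throw std::invalid_argument("expected Date;Close Price;Volume");
        }
        date = fields[0];
        closePrice = std::stod(fields[1]);  // throws if the field is not numeric
        volume = std::stod(fields[2]);
    }
};
```

A caller would read each file line by line with std::getline, skip the header row, and construct one object per line, for example SecurityClose("ACM", splitLine(line)).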

That's basically most of the battle there. Once you've pulled your data into sensibly organized classes, the rest should come more easily. And if you run into bumps, hopefully you can ask specific design or implementation questions from there.
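As an illustration of how small the remaining pieces can be once the data is loaded, here is a sketch of the two rules stated in the question: the median calculation and the "Month % 3 = 0" choice of output format. The function names are hypothetical, and the sketch assumes months are numbered 1 to 12.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Median of a set of values. Takes a copy because nth_element reorders its input.
double median(std::vector<double> values) {
    if (values.empty()) {
        throw std::invalid_argument("median of an empty set");
    }
    const std::size_t mid = values.size() / 2;
    std::nth_element(values.begin(), values.begin() + mid, values.end());
    const double upper = values[mid];
    if (values.size() % 2 == 1) {
        return upper;  // odd count: the middle element is the median
    }
    // Even count: average the two middle values. After nth_element, every
    // element before mid is <= upper, so the lower middle is their maximum.
    const double lower = *std::max_element(values.begin(), values.begin() + mid);
    return (lower + upper) / 2.0;
}

// Months 3, 6, 9 and 12 use the indicator format; all others use the
// monthly median format. Assumption: month is in the range 1..12.
bool usesIndicatorFormat(int month) {
    return month % 3 == 0;
}
```

Keeping these as free functions that know nothing about files makes them easy to test separately from the reading and writing code.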