I have a text file (.txt) something like this:
{"accountNumber": "737265056", "customerId": "737265056", "creditLimit": 5000.0, "availableMoney": 5000.0, "transactionDateTime": "2016-08-13T14:27:32", "transactionAmount": 98.55, "merchantName": "Uber", "acqCountry": "US", "merchantCountryCode": "US", "posEntryMode": "02", "posConditionCode": "01", "merchantCategoryCode": "rideshare", "currentExpDate": "06/2023", "accountOpenDate": "2015-03-14", "dateOfLastAddressChange": "2015-03-14", "cardCVV": "414", "enteredCVV": "414", "cardLast4Digits": "1803", "transactionType": "PURCHASE", "echoBuffer": "", "currentBalance": 0.0, "merchantCity": "", "merchantState": "", "merchantZip": "", "cardPresent": false, "posOnPremises": "", "recurringAuthInd": "", "expirationDateKeyInMatch": false, "isFraud": false}
{"accountNumber": "737265056", "customerId": "737265056", "creditLimit": 5000.0, "availableMoney": 5000.0, "transactionDateTime": "2016-10-11T05:05:54", "transactionAmount": 74.51, "merchantName": "AMC #191138", "acqCountry": "US", "merchantCountryCode": "US", "posEntryMode": "09", "posConditionCode": "01", "merchantCategoryCode": "entertainment", "cardPresent": true, "currentExpDate": "02/2024", "accountOpenDate": "2015-03-14", "dateOfLastAddressChange": "2015-03-14", "cardCVV": "486", "enteredCVV": "486", "cardLast4Digits": "767", "transactionType": "PURCHASE", "echoBuffer": "", "currentBalance": 0.0, "merchantCity": "", "merchantState": "", "merchantZip": "", "posOnPremises": "", "recurringAuthInd": "", "expirationDateKeyInMatch": false, "isFraud": false}
{"accountNumber": "737265056", "customerId": "737265056", "creditLimit": 5000.0, "availableMoney": 5000.0, "transactionDateTime": "2016-11-08T09:18:39", "transactionAmount": 7.47, "merchantName": "Play Store", "acqCountry": "US", "merchantCountryCode": "US", "posEntryMode": "09", "posConditionCode": "01", "merchantCategoryCode": "mobileapps", "currentExpDate": "08/2025", "accountOpenDate": "2015-03-14", "dateOfLastAddressChange": "2015-03-14", "cardCVV": "486", "enteredCVV": "486", "cardLast4Digits": "767", "transactionType": "PURCHASE", "echoBuffer": "", "currentBalance": 0.0, "merchantCity": "", "merchantState": "", "merchantZip": "", "cardPresent": false, "posOnPremises": "", "recurringAuthInd": "", "expirationDateKeyInMatch": false, "isFraud": false}
{"accountNumber": "737265056", "customerId": "737265056", "creditLimit": 5000.0, "availableMoney": 5000.0, "transactionDateTime": "2016-12-10T02:14:50", "transactionAmount": 7.47, "merchantName": "Play Store", "acqCountry": "US", "merchantCountryCode": "US", "posEntryMode": "09", "posConditionCode": "01", "merchantCategoryCode": "mobileapps", "currentExpDate": "08/2025", "accountOpenDate": "2015-03-14", "dateOfLastAddressChange": "2015-03-14", "cardCVV": "486", "enteredCVV": "486", "cardLast4Digits": "767", "transactionType": "PURCHASE", "echoBuffer": "", "currentBalance": 0.0, "merchantCity": "", "merchantState": "", "merchantZip": "", "cardPresent": false, "posOnPremises": "", "recurringAuthInd": "", "expirationDateKeyInMatch": false, "isFraud": false}
.
.
.
.
.
.
.
How to load it into a dataframe?
Every line looks to be a separate json object. Using just Python:
import pandas as pd
import json
with open('data.json') as f:
lines = f.readlines()
data = []
for line in lines:
data.append(json.loads(line))
df = pd.DataFrame(data)
df
Looks like
accountNumber customerId creditLimit availableMoney transactionDateTime transactionAmount merchantName acqCountry merchantCountryCode posEntryMode posConditionCode merchantCategoryCode currentExpDate accountOpenDate dateOfLastAddressChange cardCVV enteredCVV cardLast4Digits transactionType echoBuffer currentBalance merchantCity merchantState merchantZip cardPresent posOnPremises recurringAuthInd expirationDateKeyInMatch isFraud
0 737265056 737265056 5000.0 5000.0 2016-08-13T14:27:32 98.55 Uber US US 02 01 rideshare 06/2023 2015-03-14 2015-03-14 414 414 1803 PURCHASE 0.0 False False False
1 737265056 737265056 5000.0 5000.0 2016-10-11T05:05:54 74.51 AMC #191138 US US 09 01 entertainment 02/2024 2015-03-14 2015-03-14 486 486 767 PURCHASE 0.0 True False False
2 737265056 737265056 5000.0 5000.0 2016-11-08T09:18:39 7.47 Play Store US US 09 01 mobileapps 08/2025 2015-03-14 2015-03-14 486 486 767 PURCHASE 0.0 False False False
3 737265056 737265056 5000.0 5000.0 2016-12-10T02:14:50 7.47 Play Store US US 09 01 mobileapps 08/2025 2015-03-14 2015-03-14 486 486 767 PURCHASE 0.0 False False False
If the file were just one json object rather than a new json object every line, you could just use pandas.read_json(file_path)