I would like to generate a dataframe in which the "Date" column holds randomly generated timestamps. I want these timestamps to follow a Gaussian (normal) distribution. I know the function random.gauss(), and I have this code:
from faker import Faker
import pandas as pd
import numpy as np
from datetime import timedelta
fake = Faker()  # create the generator instance used below
fake_parking = [
{'Licence Plate':fake.license_plate(),
'Start_date':fake.date_time_between_dates(datetime_start='-2y', datetime_end='-1d'),
'Duration':fake.time_delta(end_datetime='+30d')
} for x in range(10000)]
df = pd.DataFrame(fake_parking)
Here I generate random dates, and I would like these dates to follow a Gaussian distribution instead.
Assuming the dataframe to generate has three columns, LicensePlate, Start Date and Duration, one can do the following:
import pandas as pd
import random
import datetime as dt
import faker
fake = faker.Faker()
df = pd.DataFrame({
'LicensePlate': [fake.license_plate() for i in range(100)],
'Start Date': [dt.datetime.now() + dt.timedelta(seconds=random.gauss(0, 1000)) for i in range(100)],
'Duration': [dt.timedelta(seconds=random.gauss(0, 1000)) for i in range(100)]
})
[Out]:
LicensePlate Start Date Duration
0 XV 5129 2022-10-18 12:59:29.287650 0 days 00:24:58.640538
1 91-60124 2022-10-18 13:21:41.058608 -1 days +23:43:29.201520
2 733TBH 2022-10-18 13:26:30.057752 -1 days +23:43:59.308018
3 955 YJB 2022-10-18 13:48:31.069223 0 days 00:08:14.982752
4 0-82573 2022-10-18 13:00:43.735401 0 days 00:02:33.887666
.. ... ... ...
95 MHS 812 2022-10-18 13:29:13.169237 0 days 00:12:18.462455
96 D66-19E 2022-10-18 13:22:49.714652 -1 days +23:42:44.846897
97 SGW 257 2022-10-18 13:12:32.425996 -1 days +23:47:04.114940
98 K16-80P 2022-10-18 13:42:09.283379 -1 days +23:39:17.864417
99 28-83111 2022-10-18 13:03:26.028862 0 days 00:03:46.996096
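Notes:
One is using faker to generate fake license plates.
To make the values follow a Normal/Gaussian distribution, one is using random.gauss. One can adjust the mean and standard deviation accordingly. With a mean of 0, as above, roughly half of the durations come out negative (the "-1 days +23:..." rows in the output) and roughly half of the start dates fall after the current time, so a positive mean for Duration and a mean in the past for Start Date are more realistic choices.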
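As a rough illustration of adjusting the mean and standard deviation, the sketch below centres the start dates on the midpoint of the window from the question (two years ago to one day ago) and redraws any sample that falls outside it, which gives a truncated normal rather than a pure one. The helper names gaussian_datetimes and gaussian_durations, the sigmas parameter, and the one-hour mean duration are assumptions made for this example; they do not come from faker or pandas.

```python
import datetime as dt
import random


def gaussian_datetimes(start, end, n, sigmas=3.0, seed=None):
    """Return n datetimes normally distributed around the midpoint of [start, end].

    Hypothetical helper: the standard deviation is chosen so that start and
    end sit `sigmas` standard deviations from the mean. Draws outside the
    window are redrawn, so the result is a truncated normal.
    """
    rng = random.Random(seed)
    span = (end - start).total_seconds()
    mean = span / 2
    std = span / (2 * sigmas)
    result = []
    while len(result) < n:
        offset = rng.gauss(mean, std)
        if 0 <= offset <= span:
            result.append(start + dt.timedelta(seconds=offset))
    return result


def gaussian_durations(mean_seconds, std_seconds, n, seed=None):
    """Return n strictly positive timedeltas drawn from a normal distribution.

    Hypothetical helper: non-positive draws are redrawn, so a duration can
    never be negative.
    """
    rng = random.Random(seed)
    result = []
    while len(result) < n:
        seconds = rng.gauss(mean_seconds, std_seconds)
        if seconds > 0:
            result.append(dt.timedelta(seconds=seconds))
    return result


now = dt.datetime.now()
range_start = now - dt.timedelta(days=730)  # roughly two years ago
range_end = now - dt.timedelta(days=1)      # one day ago

start_dates = gaussian_datetimes(range_start, range_end, 1000, seed=42)
# Assumed parameters: mean of one hour, standard deviation of 20 minutes.
durations = gaussian_durations(3600, 1200, 1000, seed=42)

records = [
    {'Start_date': s, 'Duration': d}
    for s, d in zip(start_dates, durations)
]
```

The records list can be passed to pd.DataFrame(records) as in the question, and a 'Licence Plate' entry from fake.license_plate() can be added to each dict in the same comprehension. Redrawing out-of-range samples slightly distorts the tails; if that matters, clipping or a wider window is an alternative.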