This is a somewhat broad topic, but I will try to pare it down to some specific questions.
I was thinking about a certain ~meta~ property in Python, where the console representation of many basic datatypes is equivalent to the code used to construct those objects:
l = [1,2,3]
d = {'a':1,'b':2,'c':3}
s = {1,2,3}
t = (1,2,3)
g = "123"
###
>>> l
[1, 2, 3]
>>> d
{'a': 1, 'b': 2, 'c': 3}
>>> s
{1, 2, 3}
>>> t
(1, 2, 3)
>>> g
'123'
So for any of these objects, I could copy the console output into the code to create those structures or assign them to variables.
This doesn't apply to some objects, like functions:
def foo():
    pass
f = foo
L = [1,2,3, foo]
###
>>> f
<function foo at 0x00000235950347B8>
>>> L
[1, 2, 3, <function foo at 0x00000235950347B8>]
While the list l above had this property, the list L here does not, but this seems to be only because L contains an element which doesn't have this property. So it seems to me that, generally, list has this property in some way.
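One way to probe this property for a given object is to evaluate its repr and compare the result with the original. The following is a minimal sketch, not a general-purpose test: the helper name round_trips is hypothetical, it relies on eval (so it should only be used on trusted values), and it assumes the object defines a meaningful ==.

```python
def round_trips(obj):
    """Return True if evaluating repr(obj) rebuilds an object equal to obj."""
    try:
        return eval(repr(obj)) == obj
    except Exception:
        # repr(obj) was not valid Python, or used names that are not available
        return False


def foo():
    pass


print(round_trips([1, 2, 3]))                    # True
print(round_trips({'a': 1, 'b': 2, 'c': 3}))     # True
print(round_trips([1, 2, 3, foo]))               # False
```

The last call fails because the repr of the list contains <function foo at 0x...>, which is not valid Python syntax.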
This applies to some objects in non-standard libraries as well:
import numpy as np
a = np.array([1,2,3])
import pandas as pd
dr = pd.date_range('01-01-2020','01-02-2020', freq='3H')
###
>>> a
array([1, 2, 3])
>>> dr
DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 03:00:00',
               '2020-01-01 06:00:00', '2020-01-01 09:00:00',
               '2020-01-01 12:00:00', '2020-01-01 15:00:00',
               '2020-01-01 18:00:00', '2020-01-01 21:00:00',
               '2020-01-02 00:00:00'],
              dtype='datetime64[ns]', freq='3H')
For the numpy array, the console output matches the code used, provided you have array in the namespace. For pandas.date_range, it's a little bit different, because the console output can construct the same object created by dr = pd.date_range('01-01-2020','01-02-2020', freq='3H'), but with different code.
A DataFrame doesn't hold this property; however, using the to_dict() method converts it into a structure which does hold this property:
import pandas as pd
df = pd.DataFrame({'A':[1,2,3],
'B':[4,5,6]})
###
>>> df
   A  B
0  1  4
1  2  5
2  3  6
>>> df.to_dict()
{'A': {0: 1, 1: 2, 2: 3}, 'B': {0: 4, 1: 5, 2: 6}}
>>> pd.DataFrame.from_dict({'A': {0: 1, 1: 2, 2: 3}, 'B': {0: 4, 1: 5, 2: 6}})
   A  B
0  1  4
1  2  5
2  3  6
An example scenario where this is useful is posting on SO, because you can convert your DataFrame into a data structure whose text representation can be used to construct that data structure. So if you share the to_dict() version of your DataFrame with someone, they get Python code which can be used to recreate the structure. I have found this to be advantageous over pd.read_clipboard() in some situations.
Mainly:
Additionally (these are less concretely answerable, I recognize, and can remove if off-topic):
I apologize if this is common knowledge to people, or if I am making a mountain out of a molehill here!
What the console representation of an object is depends on the way its __repr__() method is written. So I think most of us would at least understand if you talked about this "property" as the repr of the object. The method has to return a string, but the string's contents are up to the author, so it's impossible to say in general whether the repr of an object is the same as the code needed to create it. In some cases (such as functions) the code might be too long to be useful. In others (such as recursive structures) there might be no reasonable linear representation.
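To illustrate, here is a small sketch of a class whose author chose to make the repr match the constructor call, followed by a recursive list whose repr cannot do so. The Point class is a made-up example and is not taken from any library.

```python
class Point:
    """Hypothetical example class whose repr mirrors its constructor call."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # The author decides what this string contains.
        return f"Point({self.x!r}, {self.y!r})"

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
print(repr(p))              # Point(1, 2)
print(eval(repr(p)) == p)   # True

# A recursive structure: the list contains itself.
nested = [1, 2]
nested.append(nested)
print(repr(nested))         # [1, 2, [...]]
```

Note that [1, 2, [...]] happens to be valid Python, but evaluating it produces a list containing Ellipsis rather than the original self-referencing list.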
Reposted as an answer instead of a comment in response to suggestions by participants in the comment thread.