I have a dict whose key and value types are fixed. I want to define those types in a TypedDict as follows:
class MyTable(TypedDict):
    caption: List[str]
    header: List[str]
    table: pd.DataFrame
    epilogue: List[str]
I have a function that returns a MyTable. I want to first define an empty (Typed)dict and then fill in the keys and values.
def returnsMyTable():
    result = {}
    result['caption'] = ['caption line 1', 'caption line 2']
    result['header'] = ['header line 1', 'header line 2']
    result['table'] = pd.DataFrame()
    result['epilogue'] = ['epilogue line 1', 'epilogue line 2']
    return result
Here MyPy complains that a type annotation for result is needed. I tried this:
result: MyTable = {}
but then MyPy complains that the keys are missing. Similarly, if I define the keys but set the values to None, it complains about incorrect types of the values.
Is it at all possible to initialize a TypedDict as an empty Dict first and fill in the keys and values later? The docs seem to suggest it is.
I guess I could first define the values as variables and assemble the MyTable later, but I'm dealing with legacy code that I'm adding type hints to, so I'd like to minimize the work.
What you might want here is to set totality, but I'd think twice about using it.
Quoting PEP 589:
"By default, all keys must be present in a TypedDict. It is possible to override this by specifying totality. Here is how to do this using the class-based syntax:"
class MyTable(TypedDict, total=False):
    caption: List[str]
    header: List[str]
    table: pd.DataFrame
    epilogue: List[str]

result: MyTable = {}
result2: MyTable = {"caption": ["One", "Two", "Three"]}
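To make the effect of total=False concrete, here is a minimal sketch that can be run without pandas. It assumes Python 3.9 or later (for the `__required_keys__` and `__optional_keys__` attributes), and the `table` field is typed as a list of rows purely as a stand-in for pd.DataFrame.

```python
from typing import List, TypedDict


class MyTable(TypedDict, total=False):
    caption: List[str]
    header: List[str]
    table: List[List[str]]  # assumption: stand-in for pd.DataFrame
    epilogue: List[str]


# With total=False every key is optional, so both assignments type-check.
result: MyTable = {}
result2: MyTable = {"caption": ["One", "Two", "Three"]}

# A TypedDict is a plain dict at runtime, so keys can be filled in later.
result["header"] = ["header line 1", "header line 2"]

# The class itself records which keys are required and which are optional.
required = MyTable.__required_keys__
optional = MyTable.__optional_keys__
```

Here `required` ends up empty and `optional` holds all four keys, which is exactly why the type checker stops insisting that any of them be present.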
As I said, think twice about that. A total TypedDict gives you a very nice guarantee that all of the items will exist. That is, because MyPy won't allow result to exist without "caption", you can safely write cap = result["caption"].
If you set total=False, that guarantee goes away. On the assumption that you're reading from your MyTable much more often than you're constructing it, getting the additional safety assurances when you use it is probably a good trade.
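As a rough illustration of what that lost guarantee costs on the consuming side, the sketch below uses a hypothetical cut-down type (`PartialTable`) and a hypothetical helper (`first_caption_line`), neither of which is part of the question's code. Every reader of a non-total TypedDict has to allow for a missing key in some way like this.

```python
from typing import List, TypedDict


class PartialTable(TypedDict, total=False):
    # Hypothetical two-key version of MyTable, kept short for the example.
    caption: List[str]
    header: List[str]


def first_caption_line(table: PartialTable) -> str:
    # "caption" may be absent, so table["caption"] could raise KeyError.
    # .get() returns None for a missing key instead of raising.
    caption = table.get("caption")
    if not caption:
        # Covers both a missing key and an empty list.
        return ""
    return caption[0]
```

With a total TypedDict the body could simply index table["caption"], because the key is guaranteed to be there.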
Personally, I'd reserve total=False for cases where the creation code sometimes genuinely does leave things out and any code that uses it has to handle that. If it's just a case of taking a few lines to initialise, I'd do it like this:
def returnsMyTable() -> MyTable:
    # Build each value first, then assemble the complete dict in one step
    result_caption = ['caption line 1', 'caption line 2']
    result_header = ['header line 1', 'header line 2']
    result_table = pd.DataFrame()
    result_epilogue = ['epilogue line 1', 'epilogue line 2']
    # Annotated here so MyPy checks that every key is present and correctly typed
    result: MyTable = {
        "caption": result_caption,
        "header": result_header,
        "table": result_table,
        "epilogue": result_epilogue
    }
    return result
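If the legacy code fills the dict in many places and renaming every assignment is too much work, one possible variation on the same idea is to leave the incremental filling on a loosely typed dict and assemble the total MyTable once at the end. This is only a sketch: the `raw` variable and the `returns_my_table` name are made up for the example, and `table` is again typed as a list of rows in place of pd.DataFrame so the code runs without pandas.

```python
from typing import Any, Dict, List, TypedDict


class MyTable(TypedDict):
    caption: List[str]
    header: List[str]
    table: List[List[str]]  # assumption: stand-in for pd.DataFrame
    epilogue: List[str]


def returns_my_table() -> MyTable:
    # The legacy-style incremental filling stays on an ordinary dict.
    raw: Dict[str, Any] = {}
    raw["caption"] = ["caption line 1", "caption line 2"]
    raw["header"] = ["header line 1", "header line 2"]
    raw["table"] = []
    raw["epilogue"] = ["epilogue line 1", "epilogue line 2"]

    # Assemble the total TypedDict in one literal. A key that was never
    # filled in raises KeyError here rather than somewhere downstream.
    return {
        "caption": raw["caption"],
        "header": raw["header"],
        "table": raw["table"],
        "epilogue": raw["epilogue"],
    }
```

The trade-off is that the values pass through Any, so MyPy verifies the key names in the final literal but not the value types. Callers still get a total MyTable, which keeps the guarantee discussed above.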