I have an XHTML file that is completely tagged using the IFRS taxonomy.
I did that by writing a program that reads the concepts of the IFRS taxonomy via Arelle and then lets you use them to tag specific elements.
I would like to describe how I understand the whole thing and ask you to correct me if I am wrong.
As I understand it, the IFRS taxonomy is the complete set of available concepts, of which you usually use only a subset for tagging.
Therefore, in addition to the actual XHTML file, there must also be an extension taxonomy.
That means you have the folder structure
the folder www.mycompany.com/xbrl/2025
contains files like
And now to my actual question:
How can I generate those files based on a tagged report?
And is that even the right way to proceed?
Your understanding is correct. However, the IFRS taxonomy contains more than the concept definitions: it also has human-readable labels (for ESEF, in many languages of the European Union), references that point from each concept to the relevant passage of the standard (here IFRS), and presentation rules that help filers navigate the taxonomy and find the concepts they need. The files in the extension taxonomy have the following roles. Specifically,
123456abcde-2022-12-31.xsd
contains extension concepts, that is, concepts needed by the filer but for which there was no suitable equivalent in the IFRS taxonomy.
123456abcde-2022-12-31_cal.xml
contains calculation rules (e.g., assets = current assets + non-current assets) for validation purposes. These rules will vary across filings because of the specifics of the filer and must thus be provided in the extension taxonomy.
123456abcde-2022-12-31_def.xml
contains hypercube definition rules for hypercube validation purposes. Such rules specify the dimensions that a fact is allowed to contain based on the concept it reports.
123456abcde-2022-12-31_lab-de.xml
contains German human-readable labels for each extension concept (the labels for the IFRS concepts are in the IFRS taxonomy already).
123456abcde-2022-12-31_lab-en.xml
contains English human-readable labels for each extension concept (the labels for the IFRS concepts are in the IFRS taxonomy already).
123456abcde-2022-12-31_pre.xml
contains presentation rules, i.e., the hierarchy of concepts corresponding to each section (balance sheet, income statement, cash flow statement, etc.).
These files should be generated by an XBRL processor, either programmatically or with a UI. It is not recommended to directly generate them as XML because this is very complex and prone to errors. An XBRL processor may also be aware of ESEF and will follow its specific rules.
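To make the role of the calculation rules concrete, here is a minimal Python sketch of the kind of check they enable. It is not how an XBRL processor implements calculation validation (a real one also handles contexts, units, and rounding based on the `decimals` attribute); the `CALCULATION_RULES` dictionary and the `check_calculations` function are hypothetical names made up for this illustration, and the fact values are invented.

```python
from decimal import Decimal

# Hypothetical in-memory form of a calculation rule:
# parent concept -> list of (child concept, weight).
CALCULATION_RULES = {
    "ifrs-full:Assets": [
        ("ifrs-full:CurrentAssets", Decimal(1)),
        ("ifrs-full:NoncurrentAssets", Decimal(1)),
    ],
}


def check_calculations(facts, rules):
    """Return (parent, reported, computed) for every rule that does not add up.

    Simplification: all facts are assumed to share one context and unit,
    and values are compared exactly (no rounding tolerance).
    """
    failures = []
    for parent, children in rules.items():
        if parent not in facts:
            continue  # nothing reported for the total, so nothing to check
        computed = sum(
            (facts[child] * weight for child, weight in children if child in facts),
            Decimal(0),
        )
        if computed != facts[parent]:
            failures.append((parent, facts[parent], computed))
    return failures


# Invented example values.
consistent_facts = {
    "ifrs-full:Assets": Decimal(1000),
    "ifrs-full:CurrentAssets": Decimal(400),
    "ifrs-full:NoncurrentAssets": Decimal(600),
}
inconsistent_facts = dict(consistent_facts, **{"ifrs-full:Assets": Decimal(1100)})

print(check_calculations(consistent_facts, CALCULATION_RULES))
print(check_calculations(inconsistent_facts, CALCULATION_RULES))
```

The first call reports no failures; the second reports the mismatch between the reported total and the sum of its children.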
Generating these files from a tagged report (that is, as I understand you mean, the HTML instance, also known as Inline XBRL) would be challenging and prone to errors and imprecision: the report tells you which concepts were used, but not the labels, hierarchies, or calculation rules. I imagine that many filers or reporting companies have a process in place with a database that produces both the HTML tables and the calculation/definition/presentation networks in a way that is consistent.
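As an illustration of how little a tagged report gives you on its own, the following Python sketch collects the concept names used in an Inline XBRL document and singles out those that are not from the IFRS taxonomy. It uses only the standard library. The sample document is invented and heavily simplified (no `ix:header`, no contexts, no namespace declarations for the concept prefixes), `mycompany:SpecialReserve` is a made-up extension concept, and comparing prefixes is a shortcut: a real tool would resolve each prefix to its namespace URI.

```python
import xml.etree.ElementTree as ET

# Namespace of Inline XBRL 1.1 elements such as ix:nonFraction.
IX_NS = "http://www.xbrl.org/2013/inlineXBRL"

# Invented, simplified sample; a real report has much more structure.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
  <body>
    <p>Total assets:
      <ix:nonFraction name="ifrs-full:Assets" contextRef="c1"
                      unitRef="EUR" decimals="0">1000</ix:nonFraction>
    </p>
    <p>Special reserve:
      <ix:nonFraction name="mycompany:SpecialReserve" contextRef="c1"
                      unitRef="EUR" decimals="0">50</ix:nonFraction>
    </p>
    <p>
      <ix:nonNumeric name="ifrs-full:NameOfReportingEntityOrOtherMeansOfIdentification"
                     contextRef="c1">My Company</ix:nonNumeric>
    </p>
  </body>
</html>"""


def concepts_used(xhtml_text):
    """Collect the 'name' attribute of every ix:nonFraction / ix:nonNumeric tag."""
    root = ET.fromstring(xhtml_text)
    names = set()
    for local_name in ("nonFraction", "nonNumeric"):
        for element in root.iter(f"{{{IX_NS}}}{local_name}"):
            names.add(element.get("name"))
    return names


def extension_concepts(names, base_prefixes=("ifrs-full",)):
    """Return the names whose prefix is not one of the base taxonomy prefixes."""
    return sorted(n for n in names if n.split(":", 1)[0] not in base_prefixes)


used = concepts_used(SAMPLE_XHTML)
print(sorted(used))
print(extension_concepts(used))
```

This recovers the list of extension concepts that the .xsd file would have to declare, but nothing about their data types, labels, or position in the presentation and calculation hierarchies, which is why the report alone is not a sufficient source.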
An approach commonly used by some processors is to let the user define these hierarchies in an Excel file, and then the processor is able to generate all these files from Excel. There are more advanced processors and architectures that use an underlying database system (document store, relational, graph database, etc) instead of Excel, possibly connected with other accounting systems of the filer. There are also companies that specialize in producing XBRL reports as a service and who provide expertise and advice.
Furthermore, the calculation, hypercube definition, and presentation networks are usually consistent with each other and are generated from a common source called a model structure.
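The idea of a single source can be sketched in Python as follows. This is only an assumption about what such a model structure might look like in memory, not the format of any particular processor: the `Node` class and the two functions are hypothetical, and the output is plain tuples, not linkbase XML (serializing to XML is the part best left to an XBRL processor).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One line of a hypothetical model structure (e.g. one row of an Excel sheet)."""
    concept: str
    # Weight in the parent's sum; None means the node does not take part in a calculation.
    weight: Optional[int] = None
    children: List["Node"] = field(default_factory=list)


def presentation_arcs(node):
    """Derive (parent, child, order) tuples for the presentation network."""
    arcs = []
    for order, child in enumerate(node.children, start=1):
        arcs.append((node.concept, child.concept, order))
        arcs.extend(presentation_arcs(child))
    return arcs


def calculation_arcs(node):
    """Derive (parent, child, weight) tuples for the calculation network."""
    arcs = []
    for child in node.children:
        if child.weight is not None:
            arcs.append((node.concept, child.concept, child.weight))
        arcs.extend(calculation_arcs(child))
    return arcs


# Illustrative fragment of a balance sheet.
model = Node(
    "ifrs-full:StatementOfFinancialPositionAbstract",
    children=[
        Node(
            "ifrs-full:Assets",
            children=[
                Node("ifrs-full:CurrentAssets", weight=1),
                Node("ifrs-full:NoncurrentAssets", weight=1),
            ],
        )
    ],
)

for arc in presentation_arcs(model):
    print("presentation:", arc)
for arc in calculation_arcs(model):
    print("calculation:", arc)
```

Because both networks are derived from the same tree, they cannot drift apart, which is the consistency the model structure is meant to guarantee.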