Dimensions redux #579
Replies: 8 comments
Further data. The list below gives instances where my code can verify that the statement cannot balance or be complete without this dimensional data. For many of these companies there are more instances like this, but my logic is not yet able to verify them. I can see them by inspecting the data by hand, but that's too laborious. Anyway, this gives you a start and an indication of the size of the problem.
ADP, 2020, BS, us-gaap:OperatingLeaseLiabilityNoncurrent
Since releasing 5.7.0 and handling the issues it triggered, I have come to the realization that dimension is not the top-level flag that maps between what a company displays in its public financials and the underlying breakdown data. For one, the Statement of Equity and Comprehensive Income contain dimensioned concepts important to the public display. Then, as you note, a lot of companies include dimensioned concepts in their balance sheet.

The North Star I am chasing with the XBRL work is to satisfy 2 use cases:
1. Seeing the financial statement as close to the actual filing as possible. This satisfies users who look at the SEC Viewer and compare against edgartools.
For #2, this means show everything.

Face Value and Breakdowns
This idea came up during research (please restate it if unclear): a face value is the top-level value of a concept as seen in the presentation. This can be a concept with no dimensions or the axis of a dimension that the other axes roll up into. Breakdowns are the segments of the dimensions, like region or product.

Companies can choose to include
which is what you are finding in your research.

So for #1 we need to find the face value of the concept. I think we are missing some logic, and I have a plan to use some information from the XBRL presentation file to find it.
I'm not completely certain about your dimensions vs. breakdowns discussion. If I understand what you're saying, that's not really the problem I'm seeing. To be as clear as possible, if we look at a company like WDAY or SLB for example, it is impossible to:
I've been working on this problem using edgartools for several months now, and I still find about 10% of filings where my heuristics can't find correct solutions. What's interesting, and cause for optimism, is that the flat files the SEC produces each month (https://www.sec.gov/data-research/sec-markets-data/financial-statement-notes-data-sets), based on the same XBRL data edgartools downloads, are much cleaner in this respect. I'm able to produce correct statements in over 99% of cases using these files. So somehow they are able to "decode" the dimensional complexity to produce good data.
@mpreiss9 Thank you for the extensive research and the detailed list of affected filings. Your work has been invaluable in understanding the scope of this problem.

Good News: Fix Implemented
We've implemented definition linkbase-based dimension filtering that addresses the core issue you identified. The fix will be in v5.7.4 (coming soon - it's merged but not yet released).

What Changed
Instead of treating all dimensions the same, EdgarTools now distinguishes:
The classification uses a tiered approach:

Verified Results
I tested WDAY and SLB with the fix:
WDAY 2025 Income Statement:
SLB 2024 Income Statement:
Both statements now balance correctly.

Your List is Now Our Test Suite
We incorporated many of the companies from your research into our regression tests:

Documentation
I've written comprehensive documentation explaining how dimension handling works:

One Remaining UX Issue
You may notice that "total" rows sometimes show NaN when only dimensional values exist:
The values ARE there - you just need to sum them for the total. We're considering auto-calculating these totals in a future release.

Your Insight About SEC Flat Files
You mentioned the SEC Financial Statement Data Sets produce correct statements in 99%+ of cases. This is great validation that the problem IS solvable. I'm working on using these to validate.

Next Steps
Thank you again for the detailed research. This kind of community contribution makes EdgarTools better for everyone.

Note: This fix addresses the dimensional face value problem. Statement of Equity matrix rendering (GH-572) is a separate issue being tracked for v5.8.0.
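For the NaN totals described under "One Remaining UX Issue", here is a minimal sketch of the summing workaround in plain Python. It does not use the EdgarTools API: the row layout, the field names (`concept`, `dimension`, `value`) and the numbers are all assumptions made up for illustration.

```python
import math

# Hypothetical rows: one "total" row with a missing value plus its
# dimensional breakdown rows. Field names and numbers are illustrative only.
rows = [
    {"concept": "CostOfRevenue", "dimension": None, "value": math.nan},
    {"concept": "CostOfRevenue", "dimension": "Products", "value": 120.0},
    {"concept": "CostOfRevenue", "dimension": "Services", "value": 80.0},
]


def fill_missing_totals(rows):
    """Return copies of rows where NaN totals become the sum of their breakdowns."""
    # Sum the dimensional (breakdown) values per concept.
    sums = {}
    for row in rows:
        if row["dimension"] is not None and not math.isnan(row["value"]):
            sums[row["concept"]] = sums.get(row["concept"], 0.0) + row["value"]

    filled = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        is_total = row["dimension"] is None
        if is_total and math.isnan(row["value"]) and row["concept"] in sums:
            new_row["value"] = sums[row["concept"]]
        filled.append(new_row)
    return filled


filled = fill_missing_totals(rows)
print(filled[0]["value"])  # 200.0
```

Summing like this is only safe when the breakdown rows come from a single axis and do not overlap. If a concept is broken down along two axes (say region and product), adding every dimensional row would double count.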
Sorry, closed accidentally. Also, can you comment on https://github.com/dgunning/edgartools/blob/main/docs/xbrl/statement-view-plan.md?
This looks very promising - I can't wait to get my hands on this release!
Fixed in v5.7.4 - Implemented definition linkbase-based dimension filtering
Out of curiosity, are you going to make this available through the xbrl.current_period path? If not, I will need to rewrite my calls to parse out the current period myself - not terrible, but wasted effort if you're going to implement it on xbrl.current_period anyway.
Issue Type
Environment
EdgarTools Version: 5.7.1
Python Version: 3.14
Operating System: macOS 14.0
Bug Description
There's been a lot of back and forth on the dimensions issue. Although I think we're heading in the right direction, I don't think we're at the destination yet.
The major remaining problem I want to call out is that using xbrl.statements.income_statement(include_dimensions=False) or
xbrl.statements.income_statement() or xbrl.statements.balance_sheet(include_dimensions=False) or
xbrl.statements.balance_sheet()
will in MANY CASES produce an incomplete and out-of-balance statement that does not match the printed, reported financial statements. Why? Because some filers choose to report a line item in their financials with ONLY dimensional XBRL lines. Anyone using the default xbrl statement calls needs to be aware that they may have an incomplete statement.
This seems to be an income statement issue. I have not yet found a case like it in the balance sheet.
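To make the failure mode concrete, here is a rough sketch of how one might flag line items that would vanish when dimensional rows are filtered out. It works on plain dictionaries, not on EdgarTools objects: the row layout, the field names and the values are hypothetical, and the concept and member names are used only as illustrative labels.

```python
from collections import defaultdict

# Hypothetical fact rows for one statement; values are made up.
# dimension=None marks a plain (non-dimensional) fact.
facts = [
    {"concept": "Revenues", "dimension": None, "value": 1000.0},
    {"concept": "CostOfGoodsAndServicesSold", "dimension": "ProductMember", "value": 400.0},
    {"concept": "CostOfGoodsAndServicesSold", "dimension": "ServiceMember", "value": 150.0},
    {"concept": "GrossProfit", "dimension": None, "value": 450.0},
]


def dimension_only_concepts(facts):
    """Return concepts that appear ONLY with a dimension attached."""
    has_plain = defaultdict(bool)
    for fact in facts:
        # Becomes True as soon as one non-dimensional fact is seen.
        has_plain[fact["concept"]] |= fact["dimension"] is None
    return sorted(c for c, plain in has_plain.items() if not plain)


missing = dimension_only_concepts(facts)
print(missing)  # ['CostOfGoodsAndServicesSold']
```

Any concept this returns would be dropped entirely by a filter that keeps only non-dimensional rows, which is how a statement ends up with no cost of goods sold line and no longer ties out.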
Reproduction
A small subset of problems
Cost of Goods Sold: BA 2023+, CARR 2020+, CHH 2022, CHRW 2018+, GD 2020+, HII 2020+, INTU 2018+, NOC 2019+, OTIS 2020+, RTX 2019+, SLB 2018+, STE 2019, TT 2019+, UPS 2018, VZ 2020+, WDAY 2019+, GEHC 2024+
SGA: CTAS 2020+
Depreciation/Amortization: EXPD 2018, CHH 2022, MAR 2022
LongTermDebtAndCapitalLeaseObligations: ADP 2023+, CAT 2020+
ContractWithCustomerLiabilityCurrent: BBY 2021+
Goodwill: BSX 2019, IBM 2023+, JKHY 2016+
Intangible assets: JKHY 2016+
PPE: BSX 2019, CSX 2015+, HLT 2022+
Short term debt: CAT 2020+
Payables: COP 2023, 2024, HLT 2019+ (and many other categories), LYB 2023+
Receivables: COP 2023, 2024, FIS 2024+, GEHC 2023+, GEV 2024+, LYB 2023+
There are many, many more line items and filings with this problem, more than I can list.
There are other affected line items, but these are the common ones.
Relevant Forms: 10-K
Additional Context
If there is going to be a default dimensions=False option, I see two possible paths to improvement:
This issue will be handled using EdgarTools' systematic issue resolution workflow. A reproduction test will be created to verify the fix.