Retrieving specific data not covered in the standard statements #495
Unanswered
Wessel-Jan
asked this question in
Q&A
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Hello Dwight,
First of all, thanks for all your work on Edgartools — it genuinely saves me hours of manual data collection and makes this sort of research possible for non-coders like me.
I'm currently trying to replicate some of the long-term identification and valuation analysis described in Phil Oakley’s How to Pick Quality Shares. To do this I need to extract certain raw accounting inputs consistently over at least 10 years. Some of these fields are straightforward, while others are inconsistently tagged across filings. In the long run I’d like to build a screening tool based on these criteria and a dashboard that displays the resulting metrics and ratios for any selected company.
Below I’ve summarized the key metrics I want to calculate and the raw inputs required. My goal is to find a stable way to pull these items across different companies and years.
Revenue generally works well using the multi-year financial data API (historical trend analysis), though there are some odd cases (e.g., LULU 2018).
Docs reference:
https://edgartools.readthedocs.io/en/latest/guides/multi-year-financial-data-api/?h=asset#historical-trend-analysis-endpoints
Operating income is consistently available over time. Normalized operating income (useful for comparing to reported operating income to detect aggressive accounting or heavy exceptional costs) is not widely reported.
ROCE = Operating income / Capital employed
Capital employed = Total assets − Current liabilities
Phil Oakley recommends adding back short-term borrowing, but due to data inconsistency I’ve simplified the formula. This may understate capital employed for firms with high levels of short-term interest-bearing debt.
I want to adjust for off-balance-sheet operating leases.
Hypothetical purchase price (PP)
= Operating lease payments × factor (6–8)
The main challenge: operating lease payments are not consistently available.
Tags attempted:
• us-gaap:OperatingLeasePayments
• us-gaap:PaymentsForOperatingLeases
Lease-adjusted ROCE =
(Operating income + PP × interest-rate estimate)
/ (Total assets − Current liabilities + PP)
To compute this I need reliable access to:
• Operating lease payments
• A consistent way to filter annual 10-K data
I used the “Track Revenue Growth Over Time” examples, combined with .by_period_length(12) and by_form_type('10-K'). Results were still mixed, so I wrote a helper function that attempts both and uses the earliest match. Unfortunately, this is still inconsistent.
Docs reference:
https://edgartools.readthedocs.io/en/latest/guides/company-facts/?h=facts+api
This breaks ROCE into profit margin and capital turnover.
ROCE = (Operating Income / Revenue) × (Revenue / Capital Employed)
All inputs are already mentioned above. I mainly include this to show where the research is going:
• Avoid firms with highly volatile profit margins (often cyclical or high fixed-cost structures).
• Capital turnover can rise either by reducing capital employed (less sustainable) or by increasing sales through new products (more sustainable).
FCFF = CFO + Dividends from Joint ventures - Taxes paid − CAPEX
Inputs needed reliably:
• Cash flow from operations (from cash flow statement; tried: “Net Cash Provided by Operating Activities”)
• Taxes paid (from cashflow statement; e.g., “Taxes Paid”)
• Capital expenditures (mixed success across facts API and historical trends)
• Dividends from Joint ventures Hard to find consistently.
I also want Free Cashflow to Equity (FCFE) to compare against FCFF, since similarity between them indicates low leverage (a desirable trait in Oakley’s framework).
FCFE = CFO − CAPEX + Net debt issued
Additional inputs needed consistently:
• Depreciation & amortization (for D&A / CFO)
• Working-capital movements year over year (to spot firms boosting current profitability via inventory reductions, extending receivables, etc.)
• Inventory-to-turnover ratio
My question
Do you have recommendations for the most reliable tags or Edgartools functions to extract the accounting figures mentioned above across different companies and years?
Or more generally:
Is there a “preferred” set of tags or an Edgartools method that normalizes these items when companies use different taxonomy extensions?
Thanks again — any guidance would be greatly appreciated, and I'm happy to test any code or approaches you recommend.
Best Regards,
Wessel
Beta Was this translation helpful? Give feedback.
All reactions