108 changes: 103 additions & 5 deletions README.md
@@ -1,19 +1,25 @@
# 💹 justETF Scraping

Scrape the [justETF](https://www.justetf.com).
Scrape the [justETF](https://www.justetf.com) - Fork optimized for the [LibreFolio](https://github.com/Alfystar/LibreFolio) code.

## 🛠️ Installation

To use the justETF scraping package in your project, install the current version from GitHub:

```shell
pip install git+https://github.com/druzsan/justetf-scraping.git
pip install git+https://github.com/Alfystar/justetf-scraping.git
```

If you are using pipenv:

```shell
pipenv install git+https://github.com/Alfystar/justetf-scraping.git
```

If you want to run the [notebooks](./notebooks), use the following installation:

```shell
pip install justetf-scraping[all]@git+https://github.com/druzsan/justetf-scraping.git
pip install justetf-scraping[all]@git+https://github.com/Alfystar/justetf-scraping.git
```

## 🚀 Usage
@@ -371,13 +377,30 @@ df = justetf_scraping.load_overview(strategy="epg-longOnly", index="MSCI World")

### 📈 Scrape ETF Chart Data from justETF ([e.g.](https://www.justetf.com/en/etf-profile.html?isin=IE00B0M62Q58#chart))

#### Get raw chart data with `query_chart`

Get the raw JSON response from the justETF API:

```python
data = justetf_scraping.query_chart("IE00B0M62Q58")
# Returns dict with: latestQuote, latestQuoteDate, price, performance, series, etc.
```
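The raw response can also be processed by hand. The following is a minimal sketch, assuming the response follows the `series` layout shown in the `query_chart` docstring (a list of entries with `date` and `value.raw`). The helper `series_to_quotes` and the `data` sample are hypothetical names used only for illustration; they are not part of the package.

```python
from datetime import date


def series_to_quotes(raw_series: list) -> dict:
    """Map each entry of the raw `series` list to {date: raw quote}."""
    return {
        date.fromisoformat(point["date"]): point["value"]["raw"]
        for point in raw_series
    }


# Hypothetical sample shaped like the documented response; not live data.
data = {
    "latestQuote": {"raw": 9.94, "localized": "9.94"},
    "latestQuoteDate": "2025-12-16",
    "series": [
        {"date": "2025-11-15", "value": {"raw": 9.59, "localized": "9.59"}},
        {"date": "2025-12-15", "value": {"raw": 9.96, "localized": "9.96"}},
    ],
}

quotes = series_to_quotes(data["series"])
latest_day = max(quotes)  # most recent date present in the series
print(latest_day, quotes[latest_day])
```

With a live call, `data` would instead come from `justetf_scraping.query_chart(...)`; `load_chart` remains the more convenient choice when a DataFrame is wanted.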

#### Load and process chart data with `load_chart`

Load the whole history of a chosen ETF by its ISIN:

```python
df = justetf_scraping.load_chart("IE00B0M62Q58")
df
```

You can also include the current day's value in the response (useful for up-to-date data):

```python
df = justetf_scraping.load_chart("IE00B0M62Q58", addCurrentValue=True)
```
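According to the `load_chart` change in this PR, the extra row is only appended when `latestQuoteDate` is not already in the index, and its relative values are computed against the first quote of the history. A minimal sketch of that arithmetic in plain Python follows; the function names and the sample values (mirroring the sample response in the `query_chart` docstring) are illustrative assumptions, not the package's implementation.

```python
from datetime import date


def append_current_value(history: dict, latest_date: date, latest_quote: float) -> dict:
    """Return a copy of `history` with the latest quote added if its date is missing."""
    extended = dict(history)
    if latest_date not in extended:
        extended[latest_date] = latest_quote
    return extended


def relative(quote: float, first_quote: float) -> float:
    """Percentage change of `quote` against the first quote of the history."""
    return 100 * (quote / first_quote - 1)


# Hypothetical values, mirroring the sample response in the docstring.
history = {date(2025, 11, 15): 9.59, date(2025, 12, 15): 9.96}
extended = append_current_value(history, date(2025, 12, 16), 9.94)

first_quote = next(iter(extended.values()))  # dicts keep insertion order
latest_relative = relative(extended[date(2025, 12, 16)], first_quote)
print(f"{latest_relative:.2f}%")
```

Calling `append_current_value` a second time with the same date leaves the history unchanged, which matches the "only if missing" check in the diff.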

<table>
<thead>
<tr style="text-align: right;">
@@ -549,9 +572,9 @@ df = justetf_scraping.compare_charts(
{
"IE00B0M62Q58": justetf_scraping.load_chart("IE00B0M62Q58"),
"IE00B0M63177": justetf_scraping.load_chart("IE00B0M63177"),
},
},
input_value="quote_with_dividends"
)
)
df
```

@@ -628,6 +651,81 @@ df
</table>
<p>7057 rows × 2 columns</p>

### 🔍 Scrape ETF Profile Data (NEW!)

Get comprehensive ETF profile data including description, holdings allocation by country and sector, and real-time quotes from gettex.

#### Get complete ETF overview with `get_etf_overview`

```python
overview = justetf_scraping.get_etf_overview("IE00B3RBWM25")

# Access basic info
print(f"Name: {overview['name']}")
print(f"TER: {overview['ter']}%")
print(f"Fund Size: EUR {overview['fund_size_eur']}m")
print(f"Description: {overview['description']}")

# Access country allocation (full list, not truncated)
for country in overview['countries']:
    print(f" {country['name']}: {country['percentage']}%")

# Access sector allocation (full list)
for sector in overview['sectors']:
    print(f" {sector['name']}: {sector['percentage']}%")

# Access top 10 holdings with their ISINs
for holding in overview['top_holdings']:
    print(f" {holding['name']} ({holding['isin']}): {holding['percentage']}%")

# Access real-time gettex quote
quote = overview['gettex']
print(f"Bid: {quote['bid']} {quote['currency']}")
print(f"Ask: {quote['ask']} {quote['currency']}")
print(f"Day Change: {quote['day_change_percent']}%")
```

The `get_etf_overview` function returns a dictionary with:

| Field | Type | Description |
|-----------------------|-------|----------------------------------|
| `isin` | str | ISIN code |
| `name` | str | ETF name |
| `description` | str | Short description |
| `index` | str | Tracked index |
| `ter` | float | Total Expense Ratio (e.g., 0.19) |
| `fund_size_eur` | float | Fund size in EUR millions |
| `replication` | str | Replication method |
| `fund_currency` | str | Fund currency |
| `distribution_policy` | str | Distributing/Accumulating |
| `inception_date` | str | Launch date |
| `fund_domicile` | str | Country of domicile |
| `countries` | list | Full country allocation |
| `sectors` | list | Full sector allocation |
| `top_holdings` | list | Top 10 holdings with ISINs |
| `gettex` | dict | Real-time quote data |
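
Because `countries` and `sectors` are plain lists of `name`/`percentage` entries (as used in the example above), they are easy to post-process. A minimal sketch, assuming that entry shape; `top_allocations` is a hypothetical helper and the percentages are made-up illustration values, not real fund data:

```python
def top_allocations(entries: list, n: int = 3) -> list:
    """Keep the `n` largest entries and fold the remainder into "Other"."""
    ranked = sorted(entries, key=lambda entry: entry["percentage"], reverse=True)
    top = ranked[:n]
    rest = sum(entry["percentage"] for entry in ranked[n:])
    if rest > 0:
        top.append({"name": "Other", "percentage": round(rest, 2)})
    return top


# Made-up allocation shaped like overview['countries']; not real data.
countries = [
    {"name": "Japan", "percentage": 6.0},
    {"name": "United States", "percentage": 60.0},
    {"name": "France", "percentage": 3.0},
    {"name": "United Kingdom", "percentage": 4.0},
]

summary = top_allocations(countries, n=2)
for entry in summary:
    print(f"{entry['name']}: {entry['percentage']}%")
```

The same helper could be applied to `overview['sectors']`, since both lists are accessed the same way in the example above.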

#### Get real-time gettex quote only with `get_gettex_quote`

```python
quote = justetf_scraping.get_gettex_quote("IE00B3RBWM25")

print(f"Bid: {quote['bid']} EUR")
print(f"Ask: {quote['ask']} EUR")
print(f"Spread: {quote['spread_percent']}%")
print(f"Day Change: {quote['day_change_percent']}%")
print(f"Timestamp: {quote['timestamp']}") # datetime object
```
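
For orientation, a percentage spread can be derived from the bid and ask prices. The sketch below uses one common convention, the spread relative to the mid price; whether `spread_percent` in the returned quote follows exactly this convention is an assumption, and the function and prices here are hypothetical.

```python
def spread_percent(bid: float, ask: float) -> float:
    """Bid-ask spread as a percentage of the mid price (assumed convention)."""
    if bid <= 0 or ask < bid:
        raise ValueError("expected 0 < bid <= ask")
    mid = (bid + ask) / 2
    return 100 * (ask - bid) / mid


# Made-up prices for illustration only.
example = spread_percent(bid=100.0, ask=100.2)
print(f"Spread: {example:.4f}%")
```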

#### Get raw gettex data with `get_gettex_quote_raw`

```python
raw_data = justetf_scraping.get_gettex_quote_raw("IE00B3RBWM25")
# Returns the raw JSON response from the WebSocket
```

For a complete example, see [test_scrape.py](notebooks/test_scrape.py)

For further exploration examples, see [Jupyter Notebooks](notebooks/)

## ⚒️ Development Setup
17 changes: 15 additions & 2 deletions justetf_scraping/__init__.py
@@ -1,8 +1,21 @@
"""
Scrape the [justETF](https://www.justetf.com).
"""
from .charts import load_chart, compare_charts
from .charts import load_chart, compare_charts, query_chart
from .overview import load_overview
from .etf_profile import (
get_etf_overview,
get_gettex_quote,
get_gettex_quote_raw,
)


__all__ = ["load_chart", "compare_charts", "load_overview"]
__all__ = [
"query_chart",
"load_chart",
"compare_charts",
"load_overview",
"get_etf_overview",
"get_gettex_quote",
"get_gettex_quote_raw",
]
106 changes: 81 additions & 25 deletions justetf_scraping/charts.py
@@ -18,7 +18,7 @@
"reduceData": "false",
"includeDividends": "false",
"features": "DIVIDENDS",
}
}


def parse_series(raw_series: Dict, value_name: str = "value") -> pd.DataFrame:
@@ -37,7 +37,56 @@ def relative(series: pd.Series) -> pd.Series:
return 100 * (series / series.iloc[0] - 1)


def load_chart(isin: str, currency: Currency = "EUR") -> pd.DataFrame:
def query_chart(isin: str, currency: Currency = "EUR") -> dict:
    """
    Query the raw chart data of an ETF from justETF.

    :param isin: ISIN of the ETF.
    :param currency: currency passed to the chart request.
    :return: dictionary with this structure:
        {
            "latestQuote": {
                "raw": 9.94, "localized": "9.94"
            },
            "latestQuoteDate": "2025-12-16",
            "price": {
                "raw": 9.96, "localized": "9.96"
            },
            "performance": {
                "raw": 3.86, "localized": "3.86"
            },
            "prevDaySeries": [],
            "series": [
                {
                    "date": "2025-11-15",
                    "value": {
                        "raw": 9.59, "localized": "9.59"
                    }
                },
                {
                    "date": "2025-12-15",
                    "value": {
                        "raw": 9.96, "localized": "9.96"
                    }
                }
            ],
            "latestDate": "2025-12-15",
            "endOfDay": "2025-12-16T21:00:00Z",
            "features": {
                "DIVIDENDS": []
            }
        }
    """
    url = BASE_URL.format(isin=isin)
    response = requests.get(
        url,
        params={**BASE_PARAMS, "currency": currency},
        headers={"User-Agent": USER_AGENT},
    )
    assert_response_status_ok(response, "chart")
    data = response.json()
    return data


def load_chart(isin: str, currency: Currency = "EUR", addCurrentValue: bool = False) -> pd.DataFrame:
"""
Get and enrich an ETF chart for the whole time period.

@@ -70,14 +119,7 @@ def load_chart(isin: str, currency: Currency = "EUR") -> pd.DataFrame:
paid out until the given date if they were reinvested
immediately.
"""
url = BASE_URL.format(isin=isin)
response = requests.get(
url,
params={**BASE_PARAMS, "currency": currency},
headers={"User-Agent": USER_AGENT},
)
assert_response_status_ok(response, "chart")
data = response.json()
data = query_chart(isin, currency)

df = parse_series(data["series"], "quote")
df["relative"] = relative(df["quote"])
@@ -90,26 +132,40 @@ def load_chart(isin: str, currency: Currency = "EUR") -> pd.DataFrame:
df["relative_with_dividends"] = relative(df["quote_with_dividends"])
df["reinvested_dividends"] = 0
for index, row in dividends_df.iterrows():
df["reinvested_dividends"] += (
df["quote"] * row["dividends"] / df.at[index, "quote"]
).mask(df.index < index, 0)
df["reinvested_dividends"] += (df["quote"] * row["dividends"] / df.at[index, "quote"]).mask(df.index < index, 0)
df["quote_with_reinvested_dividends"] = df["quote"] + df["reinvested_dividends"]
df["relative_with_reinvested_dividends"] = relative(
df["quote_with_reinvested_dividends"]
)
df["relative_with_reinvested_dividends"] = relative(df["quote_with_reinvested_dividends"])
    if addCurrentValue:
        latestQuoteDate = data["latestQuoteDate"]
        latestQuote = data["latestQuote"]["raw"]
        if latestQuoteDate not in df.index:
            first_quote = df["quote"].iloc[0]
            first_quote_with_dividends = df["quote_with_dividends"].iloc[0]
            first_quote_with_reinvested = df["quote_with_reinvested_dividends"].iloc[0]
            new_row = pd.DataFrame({
                "quote": [latestQuote],
                "relative": [100 * (latestQuote / first_quote - 1)],
                "dividends": [0],
                "cumulative_dividends": [0],
                "quote_with_dividends": [latestQuote],
                "relative_with_dividends": [100 * (latestQuote / first_quote_with_dividends - 1)],
                "reinvested_dividends": [0],
                "quote_with_reinvested_dividends": [latestQuote],
                "relative_with_reinvested_dividends": [100 * (latestQuote / first_quote_with_reinvested - 1)]
            }, index=[pd.to_datetime(latestQuoteDate)])
            df = pd.concat([df, new_row])

    df.index.name = "date"
    return df

# TODO: add Metadata extraction like countries or Sectors

def compare_charts(
charts: Dict[str, pd.DataFrame],
dates: Literal["shortest", "longest"] = "shortest",
input_value: Literal[
"quote", "quote_with_dividends", "quote_with_reinvested_dividends"
] = "quote_with_dividends",
output_value: Literal["absolute", "relative", "percentage"] = "percentage",
) -> pd.DataFrame:
charts: Dict[str, pd.DataFrame],
dates: Literal["shortest", "longest"] = "shortest",
input_value: Literal["quote", "quote_with_dividends", "quote_with_reinvested_dividends"] = "quote_with_dividends",
output_value: Literal["absolute", "relative", "percentage"] = "percentage",
) -> pd.DataFrame:
longest_chart = max(charts.values(), key=len)
charts_df = pd.DataFrame(index=longest_chart.index)
for isin, chart in charts.items():
@@ -123,7 +179,7 @@ def compare_charts(
raise ValueError(
f"`dates` argument must be one of 'shortest' or 'longest', but "
f"value '{dates}' received."
)
)

if output_value == "absolute":
return charts_df
@@ -134,4 +190,4 @@ def compare_charts(
raise ValueError(
f"`output_value` argument must be one of 'absolute', 'relative' or "
f"'percentage', but value '{output_value}' received."
)
)