Commit d662101

Merge pull request #24 from mu373/dev
Add base_seed and change output type
2 parents ad30553 + 57b924b commit d662101

14 files changed: +346 -263 lines changed

README.md

Lines changed: 24 additions & 19 deletions
@@ -40,10 +40,10 @@ estimator.fit(data)
 result = estimator.get_parameters()

 # Get the power law exponent
-gamma = result['gamma']
+gamma = result.gamma

 # Print full results
-print(estimator)
+print(result)
 ```

 ### Using degree sequence from networkx graphs
@@ -63,10 +63,10 @@ estimator.fit(degree)
 result = estimator.get_parameters()

 # Get the power law exponent
-gamma = result['gamma']
+gamma = result.gamma

 # Print full results
-print(estimator)
+print(result)
 ```

 ## Available Estimators
@@ -96,22 +96,27 @@ The full result can be obtained by `estimator.get_parameters()`, which returns a
 - Optimal bandwidths or minimum AMSE fractions

 ## Example Output
-When you `print(estimator)` after fitting, you will get the following output.
+When you `print(result)` after fitting, you will get the following output.
 ```
-==================================================
-Tail Estimation Results (HillEstimator)
-==================================================
-
-Parameters:
---------------------
-Optimal order statistic (k*): 26708
-Tail index (ξ): 0.3974
-Gamma (powerlaw exponent) (γ): 3.5167
-
-Bootstrap Results:
---------------------
-First bootstrap minimum AMSE fraction: 0.2744
-Second bootstrap minimum AMSE fraction: 0.2745
+--------------------------------------------------
+Result
+--------------------------------------------------
+Order statistics: Array of shape (200,) [1.0000, 1.0000, 1.0000, ...]
+Tail index estimates: Array of shape (200,) [1614487461647431761920.0000, 1249994621547387551744.0000, 967791073562264862720.0000, ...]
+Optimal order statistic (k*): 25153
+Tail index (ξ): 0.5942
+Power law exponent (γ): 2.6828
+Bootstrap Results:
+First Bootstrap:
+Fraction of order statistics: None
+AMSE values: None
+H Min: 0.9059
+Maximum index: None
+Second Bootstrap:
+Fraction of order statistics: None
+AMSE values: None
+H Min: 0.9090
+Maximum index: None
 ```

 ## Built-in Datasets
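
The switch from `result['gamma']` to `result.gamma` in the README implies that `get_parameters()` now returns an object with attribute access instead of a plain dictionary. The implementation of `TailEstimatorResult` is not part of the hunks shown here, so the following is only a minimal sketch of one way such an object could behave. `ResultSketch` and its internals are hypothetical and are not the package's code.

```python
from typing import Any, Dict, Optional


class ResultSketch:
    """Hypothetical stand-in for a result object; NOT tailestim's TailEstimatorResult."""

    def __init__(self, params: Optional[Dict[str, Any]] = None) -> None:
        # Copy so later changes to the caller's dict do not leak into the result.
        self._params = dict(params or {})

    def __getattr__(self, name: str) -> Any:
        # Python calls __getattr__ only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._params[name]
        except KeyError:
            raise AttributeError(name) from None

    def __str__(self) -> str:
        # Header mirrors the "Result" banner in the example output; the
        # per-key formatting below is an assumption made for illustration.
        lines = ["-" * 50, "Result", "-" * 50]
        lines += [f"{key}: {value}" for key, value in self._params.items()]
        return "\n".join(lines)


result = ResultSketch({"k_star": 25153, "xi_star": 0.5942, "gamma": 2.6828})
gamma = result.gamma  # attribute access replaces result['gamma']
print(result)
```

With an object like this, `print(result)` formats the fitted values, while `print(estimator)` is free to describe only the estimator's configuration, which matches the split this commit makes.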

docs/api.rst

Lines changed: 1 addition & 0 deletions
@@ -8,5 +8,6 @@ This section provides detailed API documentation for the tailestim package.

    base
    estimators/index
+   result
    data

docs/result.rst

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+Result Class
+==========
+
+.. automodule:: tailestim.estimators.result
+   :members:
+   :undoc-members:
+   :show-inheritance:

docs/usage.rst

Lines changed: 25 additions & 18 deletions
@@ -34,10 +34,10 @@ Using Built-in Datasets

     # Get the estimated parameters
     result = estimator.get_parameters()
-    gamma = result['gamma']
+    gamma = result.gamma

     # Print full results
-    print(estimator)
+    print(result)

 Using degree sequence from networkx graphs
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -57,10 +57,10 @@ Using degree sequence from networkx graphs

     # Get the estimated parameters
     result = estimator.get_parameters()
-    gamma = result['gamma']
+    gamma = result.gamma

     # Print full results
-    print(estimator)
+    print(result)

 Available Estimators
 ------------------
@@ -84,7 +84,7 @@ The package provides several estimators for tail estimation. For details on each
 Results
 -------

-The full result can be obtained by ``estimator.get_parameters()``, which returns a dictionary. This includes:
+The full result can be obtained by ``result = estimator.get_parameters()``. You can either print the result, or access individual attributes (e.g., `result.gamma`). The output will include:

 - ``gamma``: Power law exponent (γ = 1 + 1/ξ)
 - ``xi_star``: Tail index (ξ)
@@ -96,24 +96,31 @@ The full result can be obtained by ``estimator.get_parameters()``, which returns
 Example Output
 ------------

-When you ``print(estimator)`` after fitting, you will get the following output:
+When you ``print(result)`` after fitting, you will get the following output:

 .. code-block:: text

-    ==================================================
-    Tail Estimation Results (HillEstimator)
-    ==================================================
+    --------------------------------------------------
+    Result
+    --------------------------------------------------
+    Order statistics: Array of shape (200,) [1.0000, 1.0000, 1.0000, ...]
+    Tail index estimates: Array of shape (200,) [1614487461647431761920.0000, 1249994621547387551744.0000, 967791073562264862720.0000, ...]
+    Optimal order statistic (k*): 25153
+    Tail index (ξ): 0.5942
+    Power law exponent (γ): 2.6828
+    Bootstrap Results:
+    First Bootstrap:
+    Fraction of order statistics: None
+    AMSE values: None
+    H Min: 0.9059
+    Maximum index: None
+    Second Bootstrap:
+    Fraction of order statistics: None
+    AMSE values: None
+    H Min: 0.9090
+    Maximum index: None

-    Parameters:
-    --------------------
-    Optimal order statistic (k*): 26708
-    Tail index (ξ): 0.3974
-    Gamma (powerlaw exponent) (γ): 3.5167

-    Bootstrap Results:
-    --------------------
-    First bootstrap minimum AMSE fraction: 0.2744
-    Second bootstrap minimum AMSE fraction: 0.2745

 Built-in Datasets and Custom Data
 -------------------------------
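
The usage guide states the relation γ = 1 + 1/ξ between the power law exponent and the tail index. As a quick worked check, the sketch below applies that formula to the ξ values printed in the old and new example outputs. `gamma_from_xi` is a hypothetical helper written for this illustration, not a function from the package, and the restriction to ξ > 0 is an assumption (only heavy tails correspond to a power law).

```python
def gamma_from_xi(xi: float) -> float:
    """Power law exponent from the tail index: gamma = 1 + 1/xi."""
    # Assumption: only xi > 0 (a heavy tail) maps to a power-law exponent.
    if xi <= 0:
        raise ValueError("xi must be positive for a power-law tail")
    return 1.0 + 1.0 / xi


# Tail index values as displayed (rounded to 4 decimals) in the example outputs.
new_gamma = gamma_from_xi(0.5942)  # documented gamma: 2.6828
old_gamma = gamma_from_xi(0.3974)  # documented gamma: 3.5167

print(f"new: {new_gamma:.4f}, old: {old_gamma:.4f}")
```

Both results agree with the documented γ values to within about 0.001. The small residual is expected, because the displayed ξ is itself rounded before being fed back into the formula.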

examples/example.py

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@
 result = estimator.get_parameters()

 # Get the power law exponent
-gamma = result['gamma']
+gamma = result.gamma

 # Print full results
 print(result)

src/tailestim/estimators/base.py

Lines changed: 33 additions & 37 deletions
@@ -1,7 +1,9 @@
 """Base class for tail index estimation."""
 import numpy as np
 from abc import ABC, abstractmethod
-from typing import Dict, Any, Tuple
+from typing import Dict, Any, Tuple, Union
+from numpy.random import BitGenerator, SeedSequence, RandomState, Generator
+from .result import TailEstimatorResult

 class BaseTailEstimator(ABC):
     """Abstract base class for tail index estimation.
@@ -15,12 +17,19 @@ class BaseTailEstimator(ABC):
     bootstrap : bool, default=True
         Whether to use double-bootstrap for optimal threshold selection.
         May not be applicable for all methods.
+    base_seed: None | SeedSequence | BitGenerator | Generator | RandomState, default=None
+        Base random seed for reproducibility of bootstrap. Only used for methods with bootstrap.
     **kwargs : dict
         Additional parameters specific to each estimation method.
     """

-    def __init__(self, bootstrap: bool = True, **kwargs):
+    def __init__(
+        self,
+        bootstrap: bool = True,
+        base_seed: Union[None, SeedSequence, BitGenerator, Generator, RandomState] = None,
+        **kwargs):
         self.bootstrap = bootstrap
+        self.base_seed = base_seed
         self.kwargs = kwargs
         self.results = None

@@ -52,52 +61,39 @@ def fit(self, data: np.ndarray) -> None:
         self.results = self._estimate(ordered_data)

     @abstractmethod
-    def get_parameters(self) -> Dict[str, Any]:
+    def get_parameters(self) -> TailEstimatorResult:
         """Get the estimated parameters.

         Returns
         -------
-        Dict[str, Any]
-            Dictionary containing the estimated parameters.
+        TailEstimatorResult
+            Object containing the estimated parameters.
             The structure depends on the specific estimation method.
         """
         if self.results is None:
             raise ValueError("Model not fitted yet. Call fit() first.")
-        return {}
+        return TailEstimatorResult()

     def __str__(self) -> str:
-        """Format estimation results as a string."""
-        if self.results is None:
-            return "Model not fitted yet. Call fit() first."
-
-        params = self.get_parameters()
-
-        # Create header
-        header = "=" * 50 + "\n"
-        header += f"Tail Estimation Results ({self.__class__.__name__})\n"
-        header += "=" * 50 + "\n\n"
+        """Format estimation object as a string."""
+        # Create a string with the estimator type and fitted status
+        estim_str = "-" * 50 + "\n"
+        estim_str += f"Estimator Type: {self.__class__.__name__}\n"
+        estim_str += "-" * 50 + "\n"
+        estim_str += f"Fitted: {'Yes' if self.results is not None else 'No'}\n"

-        # Format main parameters
-        main_params = "Parameters:\n"
-        main_params += "-" * 20 + "\n"
+        # Add the arguments provided
+        estim_str += "Arguments:\n"
+        estim_str += f" bootstrap: {self.bootstrap}\n"
+        estim_str += f" base_seed: {self.base_seed}\n"

-        # Add method-specific parameter formatting
-        params_str = self._format_params(params)
+        # Add any additional kwargs
+        if self.kwargs:
+            for key, value in self.kwargs.items():
+                estim_str += f" {key}: {value}\n"

-        return header + main_params + params_str
-
-    @abstractmethod
-    def _format_params(self, params: Dict[str, Any]) -> str:
-        """Format method-specific parameters as a string.
+        # If the model is not fitted, return just the estim_str
+        if self.results is None:
+            return estim_str + "Model not fitted yet. Call fit() first."

-        Parameters
-        ----------
-        params : Dict[str, Any]
-            Dictionary of parameters to format.
-
-        Returns
-        -------
-        str
-            Formatted parameter string.
-        """
-        pass
+        return estim_str
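
The new `base_seed` argument is documented as a seed for making the bootstrap reproducible. How the package consumes it internally is not visible in these hunks, so the sketch below only illustrates the general NumPy pattern such an argument usually feeds: build a `Generator` with `numpy.random.default_rng` and draw the resamples from it. `bootstrap_means` is a hypothetical helper, not part of tailestim, and the sketch covers only the `None` and `SeedSequence` cases.

```python
import numpy as np
from numpy.random import SeedSequence


def bootstrap_means(data, n_resamples, base_seed=None):
    """Hypothetical helper: means of bootstrap resamples drawn from a seeded Generator."""
    # default_rng accepts None, an int, a SeedSequence, a BitGenerator, or a Generator.
    rng = np.random.default_rng(base_seed)
    n = len(data)
    # Each resample draws n points with replacement from the original data.
    return np.array([
        rng.choice(data, size=n, replace=True).mean()
        for _ in range(n_resamples)
    ])


data = np.arange(1.0, 101.0)

# Same seed material, so the two runs produce identical resamples.
first = bootstrap_means(data, 5, base_seed=SeedSequence(42))
second = bootstrap_means(data, 5, base_seed=SeedSequence(42))
print(np.array_equal(first, second))

# base_seed=None pulls fresh entropy from the OS on every call.
unseeded = bootstrap_means(data, 5)
```

Passing a `SeedSequence` (rather than mutating global random state) keeps the reproducibility local to the call, which is presumably why the estimator stores `base_seed` and forwards it instead of seeding NumPy globally.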

src/tailestim/estimators/hill.py

Lines changed: 13 additions & 41 deletions
@@ -1,7 +1,9 @@
 """Hill estimator implementation for tail index estimation."""
 import numpy as np
-from typing import Dict, Any, Tuple
+from typing import Dict, Any, Tuple, Union
+from numpy.random import BitGenerator, SeedSequence, RandomState, Generator
 from .base import BaseTailEstimator
+from .result import TailEstimatorResult
 from .tail_methods import hill_estimator as hill_estimate

 class HillEstimator(BaseTailEstimator):
@@ -27,6 +29,8 @@ class HillEstimator(BaseTailEstimator):
         Flag controlling bootstrap verbosity.
     diagn_plots : bool, default=False
         Flag to switch on/off generation of AMSE diagnostic plots.
+    base_seed: None | SeedSequence | BitGenerator | Generator | RandomState, default=None
+        Base random seed for reproducibility of bootstrap.
     """

     def __init__(
@@ -37,9 +41,10 @@ def __init__(
         eps_stop: float = 0.99,
         verbose: bool = False,
         diagn_plots: bool = False,
+        base_seed: Union[None, SeedSequence, BitGenerator, Generator, RandomState] = None,
         **kwargs
     ):
-        super().__init__(bootstrap=bootstrap, **kwargs)
+        super().__init__(bootstrap=bootstrap, base_seed=base_seed, **kwargs)
         self.t_bootstrap = t_bootstrap
         self.r_bootstrap = r_bootstrap
         self.eps_stop = eps_stop
@@ -66,16 +71,17 @@ def _estimate(self, ordered_data: np.ndarray) -> Tuple:
             r_bootstrap=self.r_bootstrap,
             verbose=self.verbose,
             diagn_plots=self.diagn_plots,
-            eps_stop=self.eps_stop
+            eps_stop=self.eps_stop,
+            base_seed=self.base_seed
         )

-    def get_parameters(self) -> Dict[str, Any]:
+    def get_parameters(self) -> TailEstimatorResult:
         """Get the estimated parameters.

         Returns
         -------
-        Dict[str, Any]
-            Dictionary containing:
+        TailEstimatorResult
+            Object containing:
             - k_arr: Array of order statistics
             - xi_arr: Array of tail index estimates
             - k_star: Optimal order statistic (if bootstrap=True)
@@ -116,38 +122,4 @@ def get_parameters(self) -> Dict[str, Any]:
                 }
             })

-        return params
-
-    def _format_params(self, params: Dict[str, Any]) -> str:
-        """Format Hill estimator parameters as a string.
-
-        Parameters
-        ----------
-        params : Dict[str, Any]
-            Dictionary of parameters to format.
-
-        Returns
-        -------
-        str
-            Formatted parameter string.
-        """
-        output = ""
-
-        if 'k_star' in params:
-            output += f"Optimal order statistic (k*): {params['k_star']:.0f}\n"
-            output += f"Tail index (ξ): {params['xi_star']:.4f}\n"
-            output += f"Gamma (powerlaw exponent) (γ): {params['gamma']:.4f}\n"
-
-        if self.bootstrap:
-            output += "\nBootstrap Results:\n"
-            output += "-" * 20 + "\n"
-            bs1 = params['bootstrap_results']['first_bootstrap']
-            bs2 = params['bootstrap_results']['second_bootstrap']
-            output += f"First bootstrap minimum AMSE fraction: {bs1['k_min']:.4f}\n"
-            output += f"Second bootstrap minimum AMSE fraction: {bs2['k_min']:.4f}\n"
-        else:
-            output += "Note: No bootstrap results available\n"
-            output += f"Number of order statistics: {len(params['k_arr'])}\n"
-            output += f"Range of tail index estimates: [{min(params['xi_arr']):.4f}, {max(params['xi_arr']):.4f}]\n"
-
-        return output
+        return TailEstimatorResult(params)
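
The `get_parameters()` docstring lists `k_arr` (order statistics) and `xi_arr` (tail index estimates) among the returned fields. To make those two arrays concrete, here is a minimal sketch of the textbook Hill estimator, which for descending order statistics X(1) >= ... >= X(n) computes ξ(k) = (1/k) Σ_{i=1..k} ln X(i) − ln X(k+1). This is not the package's `hill_estimate` from `tail_methods`, whose body is not shown in this diff; `hill_sketch` is a hypothetical name, and the double-bootstrap selection of `k_star` is deliberately left out.

```python
import math
from typing import List, Sequence, Tuple


def hill_sketch(data: Sequence[float]) -> Tuple[List[int], List[float]]:
    """Textbook Hill estimates xi(k) for k = 1..n-1; strictly positive data assumed."""
    # Log-values sorted in descending order, i.e. largest observation first.
    logs = sorted((math.log(x) for x in data), reverse=True)
    k_arr: List[int] = []
    xi_arr: List[float] = []
    running = 0.0
    for k in range(1, len(logs)):
        running += logs[k - 1]  # sum of the k largest log-values
        k_arr.append(k)
        # Mean of the top-k logs minus the log of the (k+1)-th largest value.
        xi_arr.append(running / k - logs[k])
    return k_arr, xi_arr


# Toy data whose logs are 0, 1, 2, 3, so the estimates are easy to verify by hand.
k_arr, xi_arr = hill_sketch([math.exp(v) for v in (0, 1, 2, 3)])
print(k_arr, [round(x, 6) for x in xi_arr])
```

For the toy data the hand calculation gives ξ(1) = 3 − 2, ξ(2) = 5/2 − 1 and ξ(3) = 6/3 − 0. In practice the estimate depends strongly on k, which is why the estimator reports the whole `xi_arr` alongside a bootstrap-selected `k_star`.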
