Commit e32d78e

Merge pull request #6 from dowjones/dev
0.4.1 - Beta release with Time Series improvements
2 parents db5405e + 2aa841c commit e32d78e

65 files changed, with 2100 additions and 1372 deletions.

.github/workflows/dev_test_publish.yml

Lines changed: 86 additions & 45 deletions
@@ -20,69 +20,110 @@ env:
 
 jobs:
 
-  test:
-    name: Test
-    runs-on: ubuntu-latest
-    steps:
-      - name: Checkout code
-        uses: actions/checkout@v3
-        with:
-          ref: 'dev'
+  # test:
+  # name: Test 🧪
+  # runs-on: ubuntu-latest
+
+  # steps:
+  # - name: Checkout code
+  # uses: actions/checkout@v4
+  # with:
+  # ref: 'dev'
 
-      - name: Set up Python
-        uses: actions/setup-python@v3
-        with:
-          python-version: '3.9'
+  # - name: Set up Python
+  # uses: actions/setup-python@v5
+  # with:
+  # python-version: '3.10.9'
 
-      - name: Install latest PIP
-        run: |
-          python -m pip install --upgrade pip >> $GITHUB_STEP_SUMMARY
+  # - name: Install latest PIP
+  # run: |
+  # python -m pip install --upgrade pip
 
-      - name: Install Dependencies
-        run: |
-          python -m pip install pytest pytest-cov >> $GITHUB_STEP_SUMMARY
+  # - name: Install Dependencies
+  # run: |
+  # python -m pip install pytest pytest-cov
 
-      - name: Setup factiva-analytics <DEV> (this repo)
-        run: |
-          python -m pip install -e . >> $GITHUB_STEP_SUMMARY
+  # - name: Setup factiva-analytics <DEV> (this repo)
+  # run: |
+  # python -m pip install .
 
-      - name: pytest
-        run: pytest test/ >> $GITHUB_STEP_SUMMARY
+  # - name: pytest
+  # run: pytest test/
 
   build:
-    name: Build and Publish
+    name: Build 📦
     runs-on: ubuntu-latest
-    needs: [test]
+    # needs: [test]
+    permissions:
+      id-token: write
     steps:
       - name : Checkout code
-        uses : actions/checkout@v3
+        uses : actions/checkout@v4
         with:
           ref: 'dev'
 
       - name: Set up Python
-        uses: actions/setup-python@v3
+        uses: actions/setup-python@v5
         with:
-          python-version: '3.9'
+          python-version: '3.10.9'
 
-      - name: Install latest pip, setuptools, twine + wheel
+      # changes
+
+      - name: Install pypa/build
         run: |
-          python -m pip install --upgrade pip setuptools wheel >> $GITHUB_STEP_SUMMARY
-
-      - name: Build wheels
+          python -m pip install --upgrade build
+      - name: Build a binary build and a source tarball
         run: |
-          python setup.py bdist_wheel >> $GITHUB_STEP_SUMMARY
-          python setup.py sdist >> $GITHUB_STEP_SUMMARY
-
-      - name: Upload Artifact
-        uses: actions/upload-artifact@v3
+          python -m build
+      - name: Store the distribution packages
+        uses: actions/upload-artifact@v4
         with:
-          name: Wheel_library
+          name: python-package-distributions
           path: dist/
+
+      # - name: Install latest pip, setuptools, twine + wheel
+      # run: |
+      # python -m pip install --upgrade pip setuptools wheel >> $GITHUB_STEP_SUMMARY
 
-      - name: Publish package to TestPyPI
-        uses: pypa/gh-action-pypi-publish@release/v1
-        with:
-          user: __token__
-          password: ${{ secrets.TEST_PYPI_API_TOKEN }}
-          repository_url: https://test.pypi.org/legacy/
-          skip_existing: true
+      # - name: Build wheels
+      # run: |
+      # python setup.py bdist_wheel >> $GITHUB_STEP_SUMMARY
+      # python setup.py sdist >> $GITHUB_STEP_SUMMARY
+
+
+  publish-to-testpypi:
+    name: Publish 📦 to TestPyPI
+    needs:
+      - build
+    runs-on: ubuntu-latest
+
+    environment:
+      name: testpypi
+      url: https://test.pypi.org/p/factiva-analytics
+
+    permissions:
+      id-token: write
+
+    steps:
+      - name: Download all the dists
+        uses: actions/download-artifact@v4
+        with:
+          name: python-package-distributions
+          path: dist/
+      - name: Publish 📦 to TestPyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          repository-url: https://test.pypi.org/legacy/
+          verbose: true
+          skip-existing: true
+
+      # - name: GitHub Repo Artifact Upload
+      # uses: actions/upload-artifact@v4
+      # with:
+      # name: Wheel_library
+      # path: dist/
+
+      # - name: Publish package to TestPyPI
+      # uses: pypa/gh-action-pypi-publish@release/v1
+      # with:
+      # repository-url: https://test.pypi.org/legacy/
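
Note on the build job above: `python -m build` writes a source tarball and a wheel to `dist/` by default, which is the folder the "Store the distribution packages" step uploads. The following is a minimal sketch of a local check one could run before uploading, assuming that default layout. The `check_dist` helper is hypothetical and is not part of this commit or of any library.

```python
from pathlib import Path


def check_dist(dist_dir: str = "dist") -> dict[str, list[str]]:
    """Group the files in dist_dir into wheels and source tarballs.

    Hypothetical helper: raises if either kind of artifact is absent.
    """
    root = Path(dist_dir)
    found = {
        "wheels": sorted(p.name for p in root.glob("*.whl")),
        "sdists": sorted(p.name for p in root.glob("*.tar.gz")),
    }
    # Report every missing artifact kind in a single error.
    missing = [kind for kind, files in found.items() if not files]
    if missing:
        raise FileNotFoundError(f"{dist_dir}/ is missing: {', '.join(missing)}")
    return found


if __name__ == "__main__":
    try:
        print(check_dist("dist"))
    except FileNotFoundError as err:
        print(err)
```

Running it in the repository root after `python -m build` should list one wheel and one tarball; an empty or absent `dist/` produces the error message instead.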

.github/workflows/main_test_publish.yml

Lines changed: 5 additions & 5 deletions
@@ -25,12 +25,12 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
 
       - name: Set up Python
         uses: actions/setup-python@v3
         with:
-          python-version: '3.9'
+          python-version: '3.10.9'
 
       - name: Install latest PIP
         run: |
@@ -53,12 +53,12 @@ jobs:
     needs: [test]
     steps:
       - name : Checkout code
-        uses : actions/checkout@v3
+        uses : actions/checkout@v4
 
       - name: Set up Python
         uses: actions/setup-python@v3
         with:
-          python-version: '3.9'
+          python-version: '3.10.9'
 
       - name: Install latest pip, setuptools, twine + wheel
         run: |
@@ -80,4 +80,4 @@ jobs:
         with:
           user: __token__
           password: ${{ secrets.PYPI_API_TOKEN }}
-          skip_existing: false
+          skip-existing: false

.readthedocs.yaml

Lines changed: 5 additions & 8 deletions
@@ -2,26 +2,23 @@
 # Read the Docs configuration file
 # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
 
-# Required
 version: 2
 
-# Set the version of Python and other tools you might need
 build:
   os: ubuntu-22.04
   tools:
-    python: "3.9"
+    python: "3.12"
 
-# Build documentation in the docs/ directory with Sphinx
 sphinx:
-  configuration: docs/source/conf.py
+  configuration: docs/source/conf.py
 
-# Optionally build your docs in additional formats such as PDF
 formats:
-  - pdf
+  - pdf
+  - epub
 
-# Optionally set the version of Python and requirements required to build your docs
 python:
   install:
     - requirements: docs/requirements.txt
     - method: pip
       path: .
+

README.rst

Lines changed: 75 additions & 46 deletions
@@ -1,6 +1,7 @@
 Dow Jones Factiva Analytics Python Library
 ##########################################
 .. image:: https://github.com/dowjones/factiva-analytics-python/actions/workflows/main_test_publish.yml/badge.svg
+.. image:: https://readthedocs.org/projects/factiva-analytics-python/badge/?version=latest&style=plastic
 
 This library simplifies the integration to Factiva Analytics API services that delivers premium news content.
 
@@ -10,6 +11,7 @@ The following services are currently implemented.
 * **Snapshots**: Allows to run each snapshot creation, monitoring, download and local exploration, in an individual manner. Also allows to run the whole process within a single method.
 * **Streams**: In addition to creating and getting stream details, contains the methods to easily implement a stream listener and push the content to other locations appropriate for high-available setups.
 * **Taxonomy**: Operations that return taxonomies applied to classify news content.
+* **ArticleFetcher**: Gets article's content by unique identifiers (AN), for display purposes only.
 
 Installation
 ============
@@ -23,68 +25,95 @@ Using Library services
 ======================
 Most Factiva Analytics services are implemented in this library. There may be a delay (commonly weeks) when new features are released and their operations are implemented in this package.
 
-Creating a User Instance and Getting its statistics
----------------------------------------------------
-Create `UserKey` instance and retrieve a summary of the account statistics.
+Getting Account Information
+---------------------------
+Create an `AccountInfo` instance that contains a summary of the account's basic information and usage statistics.
 
 .. code-block:: python
 
-    from factiva.analytics import UserKey
-    u = UserKey(
-        key='abcd1234abcd1234abcd1234abcd1234', # Not needed if the ENV variable FACTIVA_USERKEY is set
-        stats=True) # Connects to the API and pulls the latest account status
+    from factiva.analytics import AccountInfo
+    u = AccountInfo(
+        user_key='abcd1234abcd1234abcd1234abcd1234' # Not needed if the ENV variable FACTIVA_USERKEY is set
+    )
     print(u)
 
 .. code-block::
 
-    <class 'factiva.core.userkey.UserKey'>
-    |-key = ****************************1234
-    |-cloud_token = **Not Fetched**
-    |-account_name = AccName1234
-    |-account_type = account_with_contract_limits
-    |-active_products = DNA
-    |-max_allowed_concurrent_extractions = 5
-    |-max_allowed_extracted_documents = 200,000
-    |-max_allowed_extractions = 3
-    |-currently_running_extractions = 0
-    |-total_downloaded_bytes = 7,253,890
-    |-total_extracted_documents = 2,515
-    |-total_extractions = 1
-    |-total_stream_instances = 4
-    |-total_stream_subscriptions = 1
-    |-enabled_company_identifiers = [{'id': 4, 'name': 'isin'}, {'id': 3, 'name': 'cusip'}, {'id': 1, 'name': 'sedol'}, {'id': 5, 'name': 'ticker_exchange'}]
-    |-remaining_documents = 197,485
-    |-remaining_extractions = 2
-
-Snapshots
----------
+    <'factiva.analytics.AccountInfo'>
+    ├─user_key: <'factiva.analytics.UserKey'>
+    │ ├─key: ****************************1234
+    │ └─cloud_token: **********************YKB12sJrkHXX
+    ├─account_name: AccName1234
+    ├─account_type: account_with_contract_limits
+    ├─active_product: DNA
+    ├─max_allowed_extracted_documents: 8,000,000
+    ├─max_allowed_extractions: 20
+    ├─currently_running_extractions: 0
+    ├─total_extracted_documents: 5,493,078
+    ├─total_extractions: 4
+    ├─total_stream_instances: 0
+    ├─total_stream_subscriptions: 0
+    ├─extractions_list: <NotLoaded>
+    ├─streams_list: <NotLoaded>
+    ├─enabled_company_identifiers:
+    │ ├─[1]: sedol
+    │ ├─[3]: cusip
+    │ ├─[4]: isin
+    │ └─[5]: ticker_exchange
+    ├─remaining_documents: 2,506,922
+    └─remaining_extractions: 16
+
+
+Snapshot Explain
+----------------
+Creates an API request that tests the query and returns the number of matching items in the archive.
+
+.. code-block:: python
+
+    from factiva.analytics import SnapshotExplain
+    my_query = "publication_datetime >= '2023-01-01 00:00:00' AND UPPER(source_code) = 'DJDN'"
+    my_explain = SnapshotExplain(
+        user_key='abcd1234abcd1234abcd1234abcd1234', # Not needed if the ENV variable FACTIVA_USERKEY is set
+        query=my_query)
+    my_explain.process_job() # This operation can take several seconds to complete
+    print(my_explain)
+
+.. code-block::
+
+    <'factiva.analytics.SnapshotExplain'>
+    ├─user_key: <'factiva.analytics.UserKey'>
+    │ ├─key: ****************************1234
+    │ └─cloud_token: **********************YKB12sJrkHXX
+    ├─query: <'factiva.analytics.SnapshotExplainQuery'>
+    │ ├─where: publication_datetime >= '2023-01-01 00:00:00' AND UPPER(source_code) = 'DJDN'
+    │ ├─includes: <NotSet>
+    │ ├─excludes: <NotSet>
+    │ ├─include_lists: <NotSet>
+    │ └─exclude_lists: <NotSet>
+    ├─job_response: <'factiva.analytics.SnapshotExplainJobResponse'>
+    │ ├─job_id: 3ee35a80-0406-4f2b-a999-3e4eb5aa94d8
+    │ ├─job_link: https://api.dowjones...8/_explain
+    │ ├─job_state: JOB_STATE_DONE
+    │ ├─volume_estimate: 2,482,057
+    │ └─errors: <NoErrors>
+    └─samples: <NotRetrieved>
+
+
+Snapshot Extraction
+-------------------
 Create a new snapshot and download to a local repository just require a few lines of code.
 
 .. code-block:: python
 
-    from factiva.analytics import Snapshot
-    my_query = "publication_datetime >= '2020-01-01 00:00:00' AND LOWER(language_code) = 'en'"
-    my_snapshot = Snapshot(
+    from factiva.analytics import SnapshotExtraction
+    my_query = "publication_datetime >= '2023-01-01 00:00:00' AND UPPER(source_code) = 'DJDN'"
+    my_snapshot = SnapshotExtraction(
         user_key='abcd1234abcd1234abcd1234abcd1234', # Can be ommited if exist as env variable
         query=my_query)
-    my_snapshot.process_extract() # This operation can take several minutes to complete
+    my_snapshot.process_job() # This operation can take several minutes to complete
 
 After the process completes, the output files are stored in a subfolder named as the Extraction Job ID.
 
 In the previous code a new snapshot is created using my_query as selection criteria and user_key for user authentication. After the job is being validated internally, a Snapshot Id is obtained along with the list of files to download. Files are automatically downloaded to a folder named equal to the snapshot ID, and contents are loaded as a Pandas DataFrame to the variable news_articles. This process may take several minutes, but automates the extraction process significantly.
 
-Streams
--------
-Create a stream instance and get the details to configure the stream client and listen the content as it is delivered.
-
-.. code-block:: python
-
-    from factiva.analytics import Stream
 
-    stream_query = Stream(
-        user_key='abcd1234abcd1234abcd1234abcd1234', # Can be ommited if exist as env variable
-        user_key_stats=True,
-        query="publication_datetime >= '2021-04-01 00:00:00' AND LOWER(language_code)='en' AND UPPER(source_code) = 'DJDN'",
-    )
-
-    print(stream_query.create())
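
Note on the sample outputs added to README.rst: the figures are consistent with each other. `remaining_documents` equals `max_allowed_extracted_documents` minus `total_extracted_documents` (8,000,000 minus 5,493,078 is 2,506,922), `remaining_extractions` is 20 minus 4, and the explain job's `volume_estimate` of 2,482,057 is below the remaining document count. The sketch below reproduces that arithmetic in plain Python with the printed values as literals. `remaining` and `fits_quota` are hypothetical helpers, not library calls, and whether the service enforces its limits in exactly this way is an assumption.

```python
def remaining(max_allowed: int, used: int) -> int:
    """Units left under a contract limit (hypothetical helper)."""
    return max_allowed - used


def fits_quota(volume_estimate: int, remaining_documents: int) -> bool:
    """True when an estimated extraction fits in the remaining allowance.

    Assumption: the estimate is compared directly with remaining documents.
    """
    return volume_estimate <= remaining_documents


# Values copied from the AccountInfo and SnapshotExplain sample outputs.
remaining_documents = remaining(8_000_000, 5_493_078)
remaining_extractions = remaining(20, 4)

print(f"{remaining_documents:,}")                  # 2,506,922
print(remaining_extractions)                       # 16
print(fits_quota(2_482_057, remaining_documents))  # True
```

Checking the explain estimate against the remaining allowance before calling the extraction is one way to avoid submitting a job that the account cannot complete.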

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+Article Fetch
+=============
+
+ArticleFetch operations tutorial

docs/source/concepts/articleretrieval.rst

Lines changed: 0 additions & 4 deletions
This file was deleted.
