Commit 2136458

chore: fixed merge conflict
2 parents 23ab7e2 + 4ceb58c commit 2136458

15 files changed: +493 -58 lines changed


.github/workflows/python-publish.yml

Lines changed: 28 additions & 28 deletions
@@ -6,7 +6,7 @@
 # separate terms of service, privacy policy, and support
 # documentation.

-name: Upload Python Package
+name: graphfaker Package build

 on:
 release:
@@ -38,33 +38,33 @@ jobs:
 name: release-dists
 path: dist/

-pypi-publish:
-runs-on: ubuntu-latest
-needs:
-- release-build
-permissions:
-# IMPORTANT: this permission is mandatory for trusted publishing
-id-token: write
+# pypi-publish:
+# runs-on: ubuntu-latest
+# needs:
+# - release-build
+# permissions:
+# # IMPORTANT: this permission is mandatory for trusted publishing
+# id-token: write

-# Dedicated environments with protections for publishing are strongly recommended.
-# For more information, see: https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment#deployment-protection-rules
-environment:
-name: pypi
-# OPTIONAL: uncomment and update to include your PyPI project URL in the deployment status:
-url: https://pypi.org/p/graphfaker
-#
-# ALTERNATIVE: if your GitHub Release name is the PyPI project version string
-# ALTERNATIVE: exactly, uncomment the following line instead:
-# url: https://pypi.org/project/YOURPROJECT/${{ github.event.release.name }}
+# # Dedicated environments with protections for publishing are strongly recommended.
+# # For more information, see: https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment#deployment-protection-rules
+# environment:
+# name: pypi
+# # OPTIONAL: uncomment and update to include your PyPI project URL in the deployment status:
+# url: https://pypi.org/p/graphfaker
+# #
+# # ALTERNATIVE: if your GitHub Release name is the PyPI project version string
+# # ALTERNATIVE: exactly, uncomment the following line instead:
+# # url: https://pypi.org/project/YOURPROJECT/${{ github.event.release.name }}

-steps:
-- name: Retrieve release distributions
-uses: actions/download-artifact@v4
-with:
-name: release-dists
-path: dist/
+# steps:
+# - name: Retrieve release distributions
+# uses: actions/download-artifact@v4
+# with:
+# name: release-dists
+# path: dist/

-- name: Publish release distributions to PyPI
-uses: pypa/gh-action-pypi-publish@release/v1
-with:
-packages-dir: dist/
+# - name: Publish release distributions to PyPI
+# uses: pypa/gh-action-pypi-publish@release/v1
+# with:
+# packages-dir: dist/

AUTHORS.rst

Lines changed: 2 additions & 2 deletions
@@ -5,9 +5,9 @@ Credits
 Development Lead
 ----------------

-* Dennis Irorere <[email protected]>
+* Dennis Irorere

 Contributors
 ------------
+* Emmanuel Jolaiya

-None yet. Why not be the first?

HISTORY.rst

Lines changed: 12 additions & 0 deletions
@@ -6,3 +6,15 @@ History
 ------------------

 * First release on PyPI.
+
+0.2.0 (2025-06-08)
+------------------
+GraphFaker v0.2.0 – June 2025
+
+This release expands GraphFaker’s scope with a new data sources to support graph construction and entity recognition tutorials:
+
+* Wikipedia fetcher (WikiFetcher)
+- Retrieve raw page data (title, summary, content, sections, links, references) via the wikipedia package
+- Export JSON dumps of article fields
+
+Upgrade now to effortlessly pull in unstructured Wikipedia data

README.md

Lines changed: 26 additions & 5 deletions
@@ -1,6 +1,6 @@
 # GraphFaker

-GraphFaker is a Python library for generating, and loading synthetic and real-world graph datasets. It supports `faker` as social graph, OpenStreetMap (OSM) road networks, and real airline flight networks. Use it for data science, research, teaching, rapid prototyping, and more!
+GraphFaker is a Python library for generating and loading synthetic and real-world datasets tailored for graph-based applications. It supports `faker` as social graph, OpenStreetMap (OSM) road networks, and real airline flight networks. Use it for data science, research, teaching, rapid prototyping, and more!

 *Note: The authors and graphgeeks labs do not hold any responsibility for the correctness of this generator.*

@@ -24,11 +24,25 @@ GraphFaker is an open-source Python library designed to generate, load, and expo

 ## Features
 - **Multiple Graph Sources:**
-- `faker`: Synthetic social graphs with rich node/edge types
-- `osm`: Real-world road networks from OpenStreetMap
-- `flights`: Real airline, airport, and flight networks
+- `faker`: Synthetic “social-knowledge” graphs powered by Faker (people, places, organizations, events, products with rich attributes and relationships)
+- `osm`: Real-world street networks directly from OpenStreetMap (by place name, address, or bounding box)
+- `flights`: Flight/airline networks from Bureau of Transportation Statistics (airlines ↔ airports ↔ flight legs, complete with cancellation and delay flags)
+- **Unstructured Data Source:**
+- `WikiFetcher`: Raw Wikipedia page data (title, summary, content, sections, links, references) ready for custom graph or RAG pipelines
 - **Easy CLI & Python Library**

+This removes friction around data acquisition, letting you focus on algorithms, teaching or rapid prototyping.
+
+## ✨ Key Features
+
+| Source | What It Gives You |
+| ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Faker** | Synthetic social-knowledge graphs with configurable sizes, weighted and directional relationships. |
+| **OSM** | Real road or walking networks via OSMnx under the hood—fetch by place, address, or bounding box; simplify topology; project to UTM. |
+| **Flights** | Airline/airport graph from BTS on-time performance data: nodes for carriers, airports, flights; edges for OPERATED\_BY, DEPARTS\_FROM, ARRIVES\_AT; batch or date-range support; subgraph sampling. |
+| **WikiFetcher** | Raw page dumps (title, summary, content, sections, links, references) as JSON |
+
+
 ---

 *Disclaimer: This is still a work in progress (WIP). With logging and debugging print statement. Our goal for releasing early is to get feedback and reiterate.*
@@ -62,10 +76,17 @@ gf = GraphFaker()
 # Synthetic social/knowledge graph
 g1 = gf.generate_graph(source="faker", total_nodes=200, total_edges=800)
 # OSM road network
-g2 = gf.generate_graph(source="osm", place="Berlin, Germany", network_type="drive")
+g2 = gf.generate_graph(source="osm", place="Chinatown, San Francisco, California", network_type="drive")
 # Flight network
 g3 = gf.generate_graph(source="flights", year=2024, month=1)

+# Fetch Wikipedia page data
+from graphfaker import WikiFetcher
+page = WikiFetcher.fetch_page("Graph theory")
+print(page['summary'])
+print(page['content'])
+WikiFetcher.export_page_json(page, "graph_theory.json")
+
 ```

 #### Advanced: Date Range for Flights

README.rst

Lines changed: 39 additions & 6 deletions
@@ -1,7 +1,7 @@
 GraphFaker
 ==========

-GraphFaker is a Python library for generating and loading synthetic and real-world graph datasets. It supports `faker` as social graph, OpenStreetMap (OSM) road networks, and real airline flight networks. Use it for data science, research, teaching, rapid prototyping, and more!
+GraphFaker is a python library for generating and loading synthetic and real-world datasets tailored for graph-based applications. It supports `faker` as social graph, OpenStreetMap (OSM) road networks, and real airline flight networks. Use it for data science, research, teaching, rapid prototyping, and more!

 *Note: The authors and GraphGeeks Labs do not hold any responsibility for the correctness of this generator.*

@@ -35,14 +35,40 @@ Solution: graphfaker
 GraphFaker is an open-source Python library designed to generate, load, and export synthetic graph datasets in a user-friendly and configurable way. It enables users to generate graphs tailored to their specific needs, allowing for better experimentation and learning without needing to think about where the data is coming from or how to fetch the data.

 Features
---------
+========

 - **Multiple Graph Sources:**
-- `faker`: Synthetic social graphs with rich node/edge types
-- `osm`: Real-world road networks from OpenStreetMap
-- `flights`: Real airline, airport, and flight networks
+- ``faker``: Synthetic “social-knowledge” graphs powered by Faker (people, places, organizations, events, products with rich attributes and relationships)
+- ``osm``: Real-world street networks directly from OpenStreetMap (by place name, address, or bounding box)
+- ``flights``: Flight/airline networks from Bureau of Transportation Statistics (airlines ↔ airports ↔ flight legs, complete with cancellation and delay flags)
+
+- **Unstructured Data Source:**
+- ``WikiFetcher``: Raw Wikipedia page data (title, summary, content, sections, links, references) ready for custom graph or entity recognition
+
 - **Easy CLI & Python Library**

+
+*Vision:: To remove friction around data acquisition, letting you focus on algorithms, teaching, or rapid prototyping.*
+
+
+Key Features
+============
+
+.. list-table::
+:header-rows: 1
+:widths: 15 85
+
+* - Source
+- What It Gives You
+* - **Faker**
+- Synthetic social-knowledge graphs with configurable sizes, weighted and directional relationships.
+* - **OSM**
+- Real road or walking networks via OSMnx under the hood—fetch by place, address, or bounding box; simplify topology; project to UTM.
+* - **Flights**
+- Airline/airport graph from BTS on-time performance data: nodes for carriers, airports, flights; edges for OPERATED_BY, DEPARTS_FROM, ARRIVES_AT; batch or date-range support; subgraph sampling.
+* - **Wikipedia**
+- Raw page dumps (title, summary, content, sections, links, references) as JSON.
+
 .. note::

 This is still a work in progress (WIP). Includes logging and debugging print statements. Our goal for releasing early is to get feedback and reiterate.
@@ -78,10 +104,17 @@ Python Library Usage
 # Synthetic social/knowledge graph
 g1 = gf.generate_graph(source="faker", total_nodes=200, total_edges=800)
 # OSM road network
-g2 = gf.generate_graph(source="osm", place="Berlin, Germany", network_type="drive")
+g2 = gf.generate_graph(source="osm", place="Chinatown, San Francisco, California", network_type="drive")
 # Flight network
 g3 = gf.generate_graph(source="flights", year=2024, month=1)

+# Fetch Wikipedia page data
+from graphfaker import WikiFetcher
+page = WikiFetcher.fetch_page("Graph theory")
+print(page['summary'])
+print(page['content'])
+WikiFetcher.export_page_json(page, "graph_theory.json")
+
 Advanced: Date Range for Flights
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

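Note on the README examples in this commit: they print fields of a fetched page and then export it to JSON. The diff does not show how `WikiFetcher.export_page_json` is implemented, so the following is only a minimal sketch of what such an export step might look like, using the standard library alone. The helper name `export_page_dict`, the output filename, and the sample values are assumptions made for illustration; only the field names (title, summary, content, sections, links, references) come from the changelog entry.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def export_page_dict(page: dict[str, Any], filename: str) -> Path:
    """Write a page dictionary to a UTF-8 JSON file (hypothetical helper,
    not graphfaker's actual implementation)."""
    path = Path(filename)
    with path.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps non-ASCII article text readable in the dump.
        json.dump(page, fh, ensure_ascii=False, indent=2)
    return path


# Hand-written stand-in for a fetched page. The keys mirror the fields named
# in the changelog; the values are placeholders, not real Wikipedia data.
sample_page = {
    "title": "Graph theory",
    "summary": "Placeholder summary text.",
    "content": "Placeholder article body.",
    "sections": ["History", "Definitions"],
    "links": ["Vertex (graph theory)"],
    "references": [],
}

# Write into the system temp directory so the sketch leaves the repo untouched.
out_path = export_page_dict(
    sample_page, str(Path(tempfile.gettempdir()) / "graph_theory_sample.json")
)
```

Reading the file back with `json.load` should return a dictionary equal to `sample_page`, which makes a dump like this convenient as a fixture for entity-recognition or graph-construction tutorials.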