|
1 | 1 | GraphFaker |
2 | 2 | ========== |
3 | 3 |
|
4 | | -GraphFaker is a Python library for generating and loading synthetic and real-world graph datasets. It supports `faker` as social graph, OpenStreetMap (OSM) road networks, and real airline flight networks. Use it for data science, research, teaching, rapid prototyping, and more! |
| 4 | +GraphFaker is a Python library for generating and loading synthetic and real-world datasets tailored for graph-based applications. It supports synthetic social graphs (`faker`), OpenStreetMap (OSM) road networks, and real airline flight networks. Use it for data science, research, teaching, rapid prototyping, and more! |
5 | 5 |
|
6 | 6 | *Note: The authors and GraphGeeks Labs do not hold any responsibility for the correctness of this generator.* |
7 | 7 |
|
@@ -35,14 +35,40 @@ Solution: graphfaker |
35 | 35 | GraphFaker is an open-source Python library designed to generate, load, and export synthetic graph datasets in a user-friendly and configurable way. It enables users to generate graphs tailored to their specific needs, allowing for better experimentation and learning without needing to think about where the data is coming from or how to fetch the data. |
36 | 36 |
|
37 | 37 | Features |
38 | | --------- |
| 38 | +======== |
39 | 39 |
|
40 | 40 | - **Multiple Graph Sources:** |
41 | | - - `faker`: Synthetic social graphs with rich node/edge types |
42 | | - - `osm`: Real-world road networks from OpenStreetMap |
43 | | - - `flights`: Real airline, airport, and flight networks |
| 41 | + - ``faker``: Synthetic “social-knowledge” graphs powered by Faker (people, places, organizations, events, products with rich attributes and relationships) |
| 42 | + - ``osm``: Real-world street networks directly from OpenStreetMap (by place name, address, or bounding box) |
| 43 | + - ``flights``: Flight/airline networks from the Bureau of Transportation Statistics (airlines ↔ airports ↔ flight legs, complete with cancellation and delay flags) |
| 44 | + |
| 45 | +- **Unstructured Data Source:** |
| 46 | + - ``WikiFetcher``: Raw Wikipedia page data (title, summary, content, sections, links, references) ready for custom graph construction or entity recognition |
| 47 | + |
44 | 48 | - **Easy CLI & Python Library** |
45 | 49 |
|
| 50 | + |
| 51 | +*Vision: To remove friction around data acquisition, letting you focus on algorithms, teaching, or rapid prototyping.* |
| 52 | + |
| 53 | + |
| 54 | +Key Features |
| 55 | +============ |
| 56 | + |
| 57 | +.. list-table:: |
| 58 | + :header-rows: 1 |
| 59 | + :widths: 15 85 |
| 60 | + |
| 61 | + * - Source |
| 62 | + - What It Gives You |
| 63 | + * - **Faker** |
| 64 | + - Synthetic social-knowledge graphs with configurable sizes, weighted and directional relationships. |
| 65 | + * - **OSM** |
| 66 | + - Real road or walking networks via OSMnx under the hood: fetch by place, address, or bounding box; simplify topology; project to UTM. |
| 67 | + * - **Flights** |
| 68 | + - Airline/airport graph from BTS on-time performance data: nodes for carriers, airports, flights; edges for OPERATED_BY, DEPARTS_FROM, ARRIVES_AT; batch or date-range support; subgraph sampling. |
| 69 | + * - **Wikipedia** |
| 70 | + - Raw page dumps (title, summary, content, sections, links, references) as JSON. |
| 71 | + |
46 | 72 | .. note:: |
47 | 73 |
|
48 | 74 | This is still a work in progress (WIP). It includes logging and debugging print statements. Our goal for releasing early is to get feedback and iterate.
@@ -78,10 +104,17 @@ Python Library Usage |
78 | 104 | # Synthetic social/knowledge graph |
79 | 105 | g1 = gf.generate_graph(source="faker", total_nodes=200, total_edges=800) |
80 | 106 | # OSM road network |
81 | | - g2 = gf.generate_graph(source="osm", place="Berlin, Germany", network_type="drive") |
| 107 | + g2 = gf.generate_graph(source="osm", place="Chinatown, San Francisco, California", network_type="drive") |
82 | 108 | # Flight network |
83 | 109 | g3 = gf.generate_graph(source="flights", year=2024, month=1) |
84 | 110 |
|
| 111 | + # Fetch Wikipedia page data |
| 112 | + from graphfaker import WikiFetcher |
| 113 | + page = WikiFetcher.fetch_page("Graph theory") |
| 114 | + print(page['summary']) |
| 115 | + print(page['content']) |
| 116 | + WikiFetcher.export_page_json(page, "graph_theory.json") |
| 117 | +
|
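The Wikipedia fields listed above (title, summary, content, sections, links, references) lend themselves to a simple page-to-link graph. The following is a minimal sketch, not part of GraphFaker: it assumes the page dictionary stores ``links`` as a list of page-title strings, uses a hand-written sample instead of a real fetch, and ``page_to_edges`` is a hypothetical helper name.

```python
import json

# Hypothetical sample shaped like the fields listed above; this is NOT
# fetched data, and the assumption that "links" is a list of title strings
# is ours, not confirmed by the library.
page = {
    "title": "Graph theory",
    "summary": "Graph theory is the study of graphs.",
    "links": [
        "Vertex (graph theory)",
        "Edge (graph theory)",
        "Graph theory",            # self-link, dropped below
        "Vertex (graph theory)",   # duplicate, dropped below
    ],
}


def page_to_edges(page):
    """Return sorted, de-duplicated (source, target) edges from a page to its links."""
    source = page["title"]
    # Use a set to remove duplicates and skip links that point back to the page.
    targets = {link for link in page.get("links", []) if link != source}
    return [(source, target) for target in sorted(targets)]


edges = page_to_edges(page)
print(json.dumps(edges, indent=2))
```

The resulting edge list can be passed to whatever graph library you prefer; sorting the targets keeps the output deterministic across runs.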
85 | 118 | Advanced: Date Range for Flights |
86 | 119 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
87 | 120 |
|