Commit 1513ccc

mosabua authored and colebow committed
Add TCB 71 about Faker connector and datafaker
1 parent 945daf9 commit 1513ccc

3 files changed: 126 additions, 5 deletions

_data/tools.yml

Lines changed: 3 additions & 0 deletions
@@ -503,6 +503,9 @@
       url: https://www.datafaker.net/
     - urltext: Faker connector documentation
       url: /docs/current/connector/faker.html
+    - urltext: Interview and demo with the Faker connector author Jan Waś in
+        Trino Community Broadcast 71
+      url: /episodes/71.html
 - name: DBeaver
   anchor: DBeaver
   category: client-application

_episodes/71.md

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
+---
+layout: episode
+title: "71: Fake it real good"
+date: 2025-02-27
+tags: trino faker data generation testing learning SQL datafaker
+youtube_id: "RChu61ouynk"
+wistia_id: "ll81pqq62a"
+sections:
+  - time: 0:00
+    title: Introduction with Manfred and Cole
+  - time: 0:48
+    title: Trino 471
+  - time: 7:17
+    title: Trino Gateway 14
+  - time: 9:38
+    title: Welcoming Jan Waś
+  - time: 11:12
+    title: First overview of Faker connector
+  - time: 17:57
+    title: Connector documentation, configuration, and details
+  - time: 27:04
+    title: In-depth demo with workflow tips and discussion
+  - time: 1:10:07
+    title: Summary with Jan and chat about other connectors
+  - time: 1:14:57
+    title: Recap with Cole and Manfred
+  - time: 1:15:22
+    title: AI functions question from the audience
+  - time: 1:16:47
+    title: Next episodes and end of show
+
+introduction: |
+  Manfred Moser and Cole Bowden are joined by Jan Waś to learn about the new
+  Faker connector and the Datafaker library. You can use it to emulate data that
+  does not exist on any storage, can shape it as you need, and then learn real
+  SQL, build real reports, and make some real charts - all with fake data.
+---
+
+## Hosts
+
+* [Manfred Moser](https://www.linkedin.com/in/manfredmoser), Director/Open
+  Source Engineering at [Starburst]({{site.url}}/users.html#starburst) -
+  [@simpligility](https://x.com/simpligility)
+* [Cole Bowden](https://www.linkedin.com/in/cole-m-bowden), Developer Advocate
+  at [Firebolt](https://www.firebolt.io/)
+
+## Guest
+
+* [Jan Waś](https://www.linkedin.com/in/janwas/),
+  Software Engineer at [Starburst]({{site.url}}/users.html#starburst)
+
+## Releases
+
+Following are some highlights of the recent releases:
+
+[Trino 471]({{site.baseurl}}/docs/current/release/release-471.html)
+
+* Add [AI functions]({{site.url}}/docs/current/functions/ai.html) for textual
+  tasks on data using OpenAI, Anthropic, or other LLMs using Ollama as backend.
+* Add support for logging output to the console in JSON format (useful in containers).
+* Support additional Python libraries for use with Python user-defined functions.
+* Remove the RPM package.
+* Add [local file system support]({{site.url}}/docs/current/object-storage/file-system-local.html).
+* Add support for S3 Tables in Iceberg connector.
+
+As always, numerous performance improvements, bug fixes, and other features were
+added as well.
+
+[Trino Gateway 14](https://trinodb.github.io/trino-gateway/release-notes/#14)
+
+Our first Trino Gateway release of 2025 shipped, and it is packed with great new
+features and fixes. Some examples are the following:
+
+* Rules editor in the web interface
+* Automatic database schema update and support for Oracle
+* Trino cluster monitoring with JMX and OpenMetrics
+
+## Introducing Jan Waś
+
+Jan, also known as [nineinchnick on GitHub](https://github.com/nineinchnick/),
+is a very active Trino contributor with a wide range of his own plugins and
+projects. He is subproject maintainer for the Helm charts and the Grafana
+plugin, and is heavily involved in GitHub Actions setup and numerous other
+efforts. Jan resides in Poland. When he is not working on Trino, you can find
+him at metal, electronics, and even opera concerts across Europe or at home
+playing video games.
+
+## Datafaker, Faker connector, and Trino
+
+We talk about using simulated data from the TPC-H and TPC-DS connectors to learn
+SQL and use it for other scenarios such as benchmarking, testing for SQL
+support, and validating other connectors and data sources. This leads us to the
+limitations of these connectors and how the Faker connector is the next step.
+
+<img src="{{site.baseurl}}/assets/images/logos/datafaker-small.png">
+
+Jan tells us about the Datafaker library and his motivation to create a
+connector, and how it eventually landed in Trino itself.
+
+## Demo time
+
+Jan shows us how to configure the connector and then demos a number of use
+cases from learning SQL to populating and testing other data sources.
+
+## Resources
+
+* [Faker connector documentation]({{site.baseurl}}/docs/current/connector/faker.html)
+* [Datafaker project]({{site.baseurl}}/ecosystem/data-source.html#datafaker)
+* [Trino reports repository](https://github.com/trinodb/reports)
+* [Other project repositories from Jan](https://github.com/nineinchnick/)
+* [Zero-cost reporting, presented at Trino Fest 2023]({% post_url
+  2023-06-28-trino-fest-2023-starburst-recap %})
+
+## Rounding out
+
+Watch the [recording of the Trino contributor call or read the
+minutes](https://github.com/trinodb/trino/wiki/Contributor-meetings).
+
+Join us for upcoming events and let us know if you want to be a guest:
+
+* Trino Community Broadcast 72: Keeping the lake clean, all about
+  [Lakekeeper](https://lakekeeper.io/)
+* Trino Community Broadcast 73: Wrapping Trino packages with a bow

broadcast/index.md

Lines changed: 0 additions & 5 deletions
@@ -20,11 +20,6 @@ interesting developments in the ecosystem around Trino.
 ## Upcoming episodes
 
 <dl>
-<dt>27 Feb 2025: Trino Community Broadcast 71 - Fake it real good</dt>
-<dd><a href="https://github.com/nineinchnick">Jan Waś</a> teaches us about the
-new Faker connector and how you can use it to emulate data that does not exist on
-any storage, how you can shape it as you need, and how you can then learn real
-SQL, build real reports, and make some real charts - all with fake data.</dd>
 <dt>13 Mar 2025: Trino Community Broadcast 72 - Keeping the lake clean</dt>
 <dd><a href="https://www.linkedin.com/in/viktor-kessler/">Viktor Kessler</a> and
 <a href="https://www.linkedin.com/in/thielc/">Christian Thiel</a> from Vakamo
