| 1 | +--- |
| 2 | +layout: episode |
| 3 | +title: "71: Fake it real good" |
| 4 | +date: 2025-02-27 |
| 5 | +tags: trino faker data generation testing learning SQL datafaker |
| 6 | +youtube_id: "RChu61ouynk" |
| 7 | +wistia_id: "ll81pqq62a" |
| 8 | +sections: |
| 9 | +- time: 0:00 |
| 10 | + title: Introduction with Manfred and Cole |
| 11 | +- time: 0:48 |
| 12 | + title: Trino 471 |
| 13 | +- time: 7:17 |
| 14 | + title: Trino Gateway 14 |
| 15 | +- time: 9:38 |
| 16 | + title: Welcoming Jan Waś |
| 17 | +- time: 11:12 |
| 18 | + title: First overview of Faker connector |
| 19 | +- time: 17:57 |
| 20 | + title: Connector documentation, configuration, and details |
| 21 | +- time: 27:04 |
| 22 | + title: In-depth demo with workflow tips and discussion |
| 23 | +- time: 1:10:07 |
| 24 | + title: Summary with Jan and chat about other connectors |
| 25 | +- time: 1:14:57 |
| 26 | + title: Recap with Cole and Manfred |
| 27 | +- time: 1:15:22 |
| 28 | + title: AI functions question from the audience |
| 29 | +- time: 1:16:47 |
| 30 | + title: Next episodes and end of show |
| 31 | + |
| 32 | +introduction: | |
| 33 | + Manfred Moser and Cole Bowden are joined by Jan Waś to learn about the new |
| 34 | + Faker connector and the Datafaker library. You can use it to emulate data that |
| 35 | + does not exist on any storage, can shape it as you need, and then learn real |
| 36 | + SQL, build real reports, and make some real charts - all with fake data. |
| 37 | +--- |
| 38 | + |
| 39 | +## Hosts |
| 40 | + |
| 41 | +* [Manfred Moser](https://www.linkedin.com/in/manfredmoser), Director/Open |
| 42 | + Source Engineering at [Starburst]({{site.url}}/users.html#starburst) - |
| 43 | + [@simpligility](https://x.com/simpligility) |
| 44 | +* [Cole Bowden](https://www.linkedin.com/in/cole-m-bowden), Developer Advocate |
| 45 | + at [Firebolt](https://www.firebolt.io/) |
| 46 | + |
| 47 | +## Guest |
| 48 | + |
| 49 | +* [Jan Waś](https://www.linkedin.com/in/janwas/), |
| 50 | +Software Engineer at [Starburst]({{site.url}}/users.html#starburst) |
| 51 | + |
| 52 | +## Releases |
| 53 | + |
| 54 | +Following are some highlights of the recent releases: |
| 55 | + |
| 56 | +[Trino 471]({{site.baseurl}}/docs/current/release/release-471.html) |
| 57 | + |
| 58 | +* Add [AI functions]({{site.url}}/docs/current/functions/ai.html) for textual |
| 59 | + tasks on data using OpenAI, Anthropic, or other LLMs with Ollama as the backend. |
| 60 | +* Add support for logging output to the console in JSON format, which is useful in containers. |
| 61 | +* Support additional Python libraries for use with Python user-defined functions. |
| 62 | +* Remove the RPM package. |
| 63 | +* Add [local file system support]({{site.url}}/docs/current/object-storage/file-system-local.html). |
| 64 | +* Add support for S3 Tables in Iceberg connector. |
| 65 | + |
| 66 | +As always, numerous performance improvements, bug fixes, and other features were |
| 67 | +added as well. |
| 68 | + |
| 69 | +[Trino Gateway 14](https://trinodb.github.io/trino-gateway/release-notes/#14) |
| 70 | + |
| 71 | +Our first Trino Gateway release of 2025 shipped, and it is packed with great new |
| 72 | +features and fixes. Some examples are the following: |
| 73 | + |
| 74 | +* Rules editor in the web interface |
| 75 | +* Automatic database schema update and support for Oracle |
| 76 | +* Trino cluster monitoring with JMX and OpenMetrics |
| 77 | + |
| 78 | +## Introducing Jan Waś |
| 79 | + |
| 80 | +Jan, also known as [nineinchnick on GitHub](https://github.com/nineinchnick/), |
| 81 | +is a very active Trino contributor with a wide range of his own plugins and |
| 82 | +projects. He is subproject maintainer for the Helm charts and the Grafana |
| 83 | +plugin, and is heavily involved in GitHub Actions setup and numerous other |
| 84 | +efforts. Jan resides in Poland. When he is not working on Trino, you can find |
| 85 | +him at metal, electronics, and even opera concerts across Europe or at home |
| 86 | +playing video games. |
| 87 | + |
| 88 | +## Datafaker, Faker connector, and Trino |
| 89 | + |
| 90 | +We talk about using simulated data from the TPC-H and TPC-DS connectors to learn |
| 91 | +SQL and use it for other scenarios such as benchmarking, testing for SQL |
| 92 | +support, and validating other connectors and data sources. This leads us to the |
| 93 | +limitations of these connectors and how the Faker connector is the next step. |
| 94 | + |
| 95 | +<img src="{{site.baseurl}}/assets/images/logos/datafaker-small.png"> |
| 96 | + |
| 97 | +Jan tells us about the Datafaker library and his motivation to create a |
| 98 | +connector, and how it eventually landed in Trino itself. |
| 99 | + |
| 100 | +## Demo time |
| 101 | + |
| 102 | +Jan shows us how to configure the connector and then demoes a number of use |
| 103 | +cases from learning SQL to populating and testing other data sources. |
| 104 | + |
| 105 | +## Resources |
| 106 | + |
| 107 | +* [Faker connector documentation]({{site.baseurl}}/docs/current/connector/faker.html) |
| 108 | +* [Datafaker project]({{site.baseurl}}/ecosystem/data-source.html#datafaker) |
| 109 | +* [Trino reports repository](https://github.com/trinodb/reports) |
| 110 | +* [Other project repositories from Jan](https://github.com/nineinchnick/) |
| 111 | +* [Zero-cost reporting, presented at Trino Fest 2023]({% post_url |
| 112 | + 2023-06-28-trino-fest-2023-starburst-recap %}) |
| 113 | + |
| 114 | +## Rounding out |
| 115 | + |
| 116 | +Watch the [recording of the Trino contributor call or read the |
| 117 | +minutes](https://github.com/trinodb/trino/wiki/Contributor-meetings). |
| 118 | + |
| 119 | +Join us for upcoming events and let us know if you want to be a guest: |
| 120 | + |
| 121 | +* Trino Community Broadcast 72: Keeping the lake clean, all about |
| 122 | + [Lakekeeper](https://lakekeeper.io/) |
| 123 | +* Trino Community Broadcast 73: Wrapping Trino packages with a bow |