Commit fcb8ade

Merge branch 'master' of github.com:TheRundown/goquery into master
2 parents 8cad9cc + a2f3ff7 commit fcb8ade

15 files changed: +261 -167 lines changed

.builds/fedora.yml

Lines changed: 0 additions & 104 deletions
This file was deleted.

.github/FUNDING.yml

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+github: [mna]
+custom: ["https://www.buymeacoffee.com/mna"]

.github/dependabot.yml

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+version: 2
+updates:
+  # Maintain dependencies for Go
+  - package-ecosystem: "gomod"
+    directory: "/"
+    schedule:
+      interval: "daily"

.github/workflows/test.yml

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+name: test
+on: [push, pull_request]
+
+env:
+  GOPROXY: https://proxy.golang.org,direct
+
+jobs:
+  test:
+    strategy:
+      matrix:
+        go-version: [1.18.x, 1.19.x]
+        os: [ubuntu-latest, macos-latest, windows-latest]
+    runs-on: ${{ matrix.os }}
+
+    steps:
+      - name: Install Go
+        uses: actions/setup-go@v2
+        with:
+          go-version: ${{ matrix.go-version }}
+
+      - name: Checkout code
+        uses: actions/checkout@v2
+
+      - name: Test
+        run: go test ./... -v -cover

.travis.yml

Lines changed: 0 additions & 31 deletions
This file was deleted.

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-Copyright (c) 2012-2016, Martin Angers & Contributors
+Copyright (c) 2012-2021, Martin Angers & Contributors
 All rights reserved.

 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

README.md

Lines changed: 22 additions & 14 deletions
@@ -1,5 +1,8 @@
 # goquery - a little like that j-thing, only in Go
-[![builds.sr.ht status](https://builds.sr.ht/~mna/goquery/commits/fedora.yml.svg)](https://builds.sr.ht/~mna/goquery/commits/fedora.yml?) [![build status](https://secure.travis-ci.org/PuerkitoBio/goquery.svg?branch=master)](http://travis-ci.org/PuerkitoBio/goquery) [![GoDoc](https://godoc.org/github.com/PuerkitoBio/goquery?status.png)](http://godoc.org/github.com/PuerkitoBio/goquery) [![Sourcegraph Badge](https://sourcegraph.com/github.com/PuerkitoBio/goquery/-/badge.svg)](https://sourcegraph.com/github.com/PuerkitoBio/goquery?badge)
+
+[![Build Status](https://github.com/PuerkitoBio/goquery/actions/workflows/test.yml/badge.svg?branch=master)](https://github.com/PuerkitoBio/goquery/actions)
+[![Go Reference](https://pkg.go.dev/badge/github.com/PuerkitoBio/goquery.svg)](https://pkg.go.dev/github.com/PuerkitoBio/goquery)
+[![Sourcegraph Badge](https://sourcegraph.com/github.com/PuerkitoBio/goquery/-/badge.svg)](https://sourcegraph.com/github.com/PuerkitoBio/goquery?badge)

 goquery brings a syntax and a set of features similar to [jQuery][] to the [Go language][go]. It is based on Go's [net/html package][html] and the CSS Selector library [cascadia][]. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off.

@@ -19,7 +22,7 @@ Syntax-wise, it is as close as possible to jQuery, with the same function names

 ## Installation

-Please note that because of the net/html dependency, goquery requires Go1.1+.
+Please note that because of the net/html dependency, goquery requires Go1.1+ and is tested on Go1.7+.

     $ go get github.com/PuerkitoBio/goquery

@@ -37,6 +40,9 @@ Please note that because of the net/html dependency, goquery requires Go1.1+.

 **Note that goquery's API is now stable, and will not break.**

+* **2021-10-25 (v1.8.0)** : Add `Render` function to render a `Selection` to an `io.Writer` (thanks [@anthonygedeon](https://github.com/anthonygedeon)).
+* **2021-07-11 (v1.7.1)** : Update go.mod dependencies and add dependabot config (thanks [@jauderho](https://github.com/jauderho)).
+* **2021-06-14 (v1.7.0)** : Add `Single` and `SingleMatcher` functions to optimize first-match selection (thanks [@gdollardollar](https://github.com/gdollardollar)).
 * **2021-01-11 (v1.6.1)** : Fix panic when calling `{Prepend,Append,Set}Html` on a `Selection` that contains non-Element nodes.
 * **2020-10-08 (v1.6.0)** : Parse html in context of the container node for all functions that deal with html strings (`AfterHtml`, `AppendHtml`, etc.). Thanks to [@thiemok][thiemok] and [@davidjwilkins][djw] for their work on this.
 * **2020-02-04 (v1.5.1)** : Update module dependencies.
@@ -50,7 +56,7 @@ Please note that because of the net/html dependency, goquery requires Go1.1+.
 * **2016-08-28 (v1.0.1)** : Optimize performance for large documents.
 * **2016-07-27 (v1.0.0)** : Tag version 1.0.0.
 * **2016-06-15** : Invalid selector strings internally compile to a `Matcher` implementation that never matches any node (instead of a panic). So for example, `doc.Find("~")` returns an empty `*Selection` object.
-* **2016-02-02** : Add `NodeName` utility function similar to the DOM's `nodeName` property. It returns the tag name of the first element in a selection, and other relevant values of non-element nodes (see godoc for details). Add `OuterHtml` utility function similar to the DOM's `outerHTML` property (named `OuterHtml` in small caps for consistency with the existing `Html` method on the `Selection`).
+* **2016-02-02** : Add `NodeName` utility function similar to the DOM's `nodeName` property. It returns the tag name of the first element in a selection, and other relevant values of non-element nodes (see [doc][] for details). Add `OuterHtml` utility function similar to the DOM's `outerHTML` property (named `OuterHtml` in small caps for consistency with the existing `Html` method on the `Selection`).
 * **2015-04-20** : Add `AttrOr` helper method to return the attribute's value or a default value if absent. Thanks to [piotrkowalczuk][piotr].
 * **2015-02-04** : Add more manipulation functions - Prepend* - thanks again to [Andrew Stone][thatguystone].
 * **2014-11-28** : Add more manipulation functions - ReplaceWith*, Wrap* and Unwrap - thanks again to [Andrew Stone][thatguystone].
@@ -79,7 +85,7 @@ jQuery often has many variants for the same function (no argument, a selector st

 Utility functions that are not in jQuery but are useful in Go are implemented as functions (that take a `*Selection` as parameter), to avoid a potential naming clash on the `*Selection`'s methods (reserved for jQuery-equivalent behaviour).

-The complete [godoc reference documentation can be found here][doc].
+The complete [package reference documentation can be found here][doc].

 Please note that Cascadia's selectors do not necessarily match all supported selectors of jQuery (Sizzle). See the [cascadia project][cascadia] for details. Invalid selector strings compile to a `Matcher` that fails to match any node. Behaviour of the various functions that take a selector string as argument follows from that fact, e.g. (where `~` is an invalid selector string):

@@ -123,12 +129,11 @@ func ExampleScrape() {
 	}

 	// Find the review items
-	doc.Find(".sidebar-reviews article .content-block").Each(func(i int, s *goquery.Selection) {
-		// For each item found, get the band and title
-		band := s.Find("a").Text()
-		title := s.Find("i").Text()
-		fmt.Printf("Review %d: %s - %s\n", i, band, title)
-	})
+	doc.Find(".left-content article .post-title").Each(func(i int, s *goquery.Selection) {
+		// For each item found, get the title
+		title := s.Find("a").Text()
+		fmt.Printf("Review %d: %s\n", i, title)
+	})
 }

 func main() {
@@ -149,6 +154,8 @@ func main() {
 - [Geziyor](https://github.com/geziyor/geziyor), a fast web crawling & scraping framework for Go. Supports JS rendering.
 - [Pagser](https://github.com/foolin/pagser), a simple, easy, extensible, configurable HTML parser to struct based on goquery and struct tags.
 - [stitcherd](https://github.com/vhodges/stitcherd), A server for doing server side includes using css selectors and DOM updates.
+- [goskyr](https://github.com/jakopako/goskyr), an easily configurable command-line scraper written in Go.
+- [goGetJS](https://github.com/davemolk/goGetJS), a tool for extracting, searching, and saving JavaScript files (with optional headless browser).

 ## Support

@@ -161,8 +168,9 @@ There are a number of ways you can support the project:
 * Pull requests: please discuss new code in an issue first, unless the fix is really trivial.
     - Make sure new code is tested.
     - Be mindful of existing code - PRs that break existing code have a high probability of being declined, unless it fixes a serious issue.
-
-If you desperately want to send money my way, I have a BuyMeACoffee.com page:
+* Sponsor the developer
+    - See the Github Sponsor button at the top of the repo on github
+    - or via BuyMeACoffee.com, below

 <a href="https://www.buymeacoffee.com/mna" target="_blank"><img src="https://www.buymeacoffee.com/assets/img/custom_images/orange_img.png" alt="Buy Me A Coffee" style="height: 41px !important;width: 174px !important;box-shadow: 0px 3px 2px 0px rgba(190, 190, 190, 0.5) !important;-webkit-box-shadow: 0px 3px 2px 0px rgba(190, 190, 190, 0.5) !important;" ></a>

@@ -177,10 +185,10 @@ The [BSD 3-Clause license][bsd], the same as the [Go language][golic]. Cascadia'
 [bsd]: http://opensource.org/licenses/BSD-3-Clause
 [golic]: http://golang.org/LICENSE
 [caslic]: https://github.com/andybalholm/cascadia/blob/master/LICENSE
-[doc]: http://godoc.org/github.com/PuerkitoBio/goquery
+[doc]: https://pkg.go.dev/github.com/PuerkitoBio/goquery
 [index]: http://api.jquery.com/index/
 [gonet]: https://github.com/golang/net/
-[html]: http://godoc.org/golang.org/x/net/html
+[html]: https://pkg.go.dev/golang.org/x/net/html
 [wiki]: https://github.com/PuerkitoBio/goquery/wiki/Tips-and-tricks
 [thatguystone]: https://github.com/thatguystone
 [piotr]: https://github.com/piotrkowalczuk
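
Reviewer note: the README text in this diff says that invalid selector strings compile to a `Matcher` that never matches any node, rather than causing a panic. The following is a minimal, stdlib-only sketch of that fallback pattern. The `matcher` interface, `tagMatcher`, `neverMatcher`, `compile`, and `compileOrNever` are all hypothetical names invented for this illustration, and the toy `compile` accepts only lowercase tag names; none of this is goquery's or cascadia's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// matcher is a hypothetical, simplified stand-in for a selector matcher.
type matcher interface {
	Match(tag string) bool
}

// tagMatcher matches nodes whose tag equals the selector text.
type tagMatcher string

func (m tagMatcher) Match(tag string) bool { return string(m) == tag }

// neverMatcher is the fallback used for invalid selectors: it matches nothing.
type neverMatcher struct{}

func (neverMatcher) Match(string) bool { return false }

// compile is a toy parser for this sketch only: it accepts non-empty
// selectors made of lowercase ASCII letters and rejects everything else.
func compile(sel string) (matcher, error) {
	if sel == "" {
		return nil, errors.New("empty selector")
	}
	for _, r := range sel {
		if r < 'a' || r > 'z' {
			return nil, fmt.Errorf("invalid selector %q", sel)
		}
	}
	return tagMatcher(sel), nil
}

// compileOrNever swallows the compile error and returns a matcher that
// never matches, so callers simply get an empty result set.
func compileOrNever(sel string) matcher {
	m, err := compile(sel)
	if err != nil {
		return neverMatcher{}
	}
	return m
}

func main() {
	tags := []string{"div", "p", "div"}
	for _, sel := range []string{"div", "~"} {
		m := compileOrNever(sel)
		n := 0
		for _, t := range tags {
			if m.Match(t) {
				n++
			}
		}
		fmt.Printf("%q matched %d of %d tags\n", sel, n, len(tags))
	}
}
```

The design trade-off in this sketch is that a malformed selector silently yields an empty result instead of an error, which keeps chained calls simple but can hide typos in selector strings.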

bench_property_test.go

Lines changed: 1 addition & 1 deletion
@@ -46,6 +46,6 @@ func BenchmarkHtml(b *testing.B) {
 	sel := DocW().Find("h2")
 	b.StartTimer()
 	for i := 0; i < b.N; i++ {
-		sel.Html()
+		_, _ = sel.Html()
 	}
 }

bench_traversal_test.go

Lines changed: 20 additions & 0 deletions
@@ -2,6 +2,8 @@ package goquery

 import (
 	"testing"
+
+	"github.com/andybalholm/cascadia"
 )

 func BenchmarkFind(b *testing.B) {
@@ -800,3 +802,21 @@ func BenchmarkClosestNodes(b *testing.B) {
 		b.Fatalf("want 2, got %d", n)
 	}
 }
+
+func BenchmarkSingleMatcher(b *testing.B) {
+	doc := Doc()
+	multi := cascadia.MustCompile(`div`)
+	single := SingleMatcher(multi)
+	b.ResetTimer()
+
+	b.Run("multi", func(b *testing.B) {
+		for i := 0; i < b.N; i++ {
+			_ = doc.FindMatcher(multi)
+		}
+	})
+	b.Run("single", func(b *testing.B) {
+		for i := 0; i < b.N; i++ {
+			_ = doc.FindMatcher(single)
+		}
+	})
+}
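
Reviewer note: the benchmark above compares a matcher that selects every `div` with one that selects only the first. A rough, stdlib-only sketch of why first-match selection can be cheaper is shown here. The `node` type and the `findAll`, `findFirst`, and `sampleTree` helpers are hypothetical and exist only for this illustration; they are not how goquery or cascadia implement `Single` or `SingleMatcher`.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for an HTML node.
type node struct {
	tag      string
	children []*node
}

// findAll walks the whole tree depth-first and collects every node whose
// tag matches. visited counts how many nodes were inspected.
func findAll(n *node, tag string, visited *int) []*node {
	*visited++
	var out []*node
	if n.tag == tag {
		out = append(out, n)
	}
	for _, c := range n.children {
		out = append(out, findAll(c, tag, visited)...)
	}
	return out
}

// findFirst returns the first match in document order and stops walking
// as soon as it is found.
func findFirst(n *node, tag string, visited *int) *node {
	*visited++
	if n.tag == tag {
		return n
	}
	for _, c := range n.children {
		if m := findFirst(c, tag, visited); m != nil {
			return m
		}
	}
	return nil
}

// sampleTree builds html > body > (div, div, div).
func sampleTree() *node {
	body := &node{tag: "body", children: []*node{{tag: "div"}, {tag: "div"}, {tag: "div"}}}
	return &node{tag: "html", children: []*node{body}}
}

func main() {
	var allVisited, firstVisited int
	all := findAll(sampleTree(), "div", &allVisited)
	first := findFirst(sampleTree(), "div", &firstVisited)
	fmt.Printf("all: %d matches, %d nodes visited\n", len(all), allVisited)
	fmt.Printf("first: found=%v, %d nodes visited\n", first != nil, firstVisited)
}
```

On this small tree the full walk inspects all five nodes while the first-match walk stops after three; the gap grows with document size when the first match appears early.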

example_test.go

Lines changed: 28 additions & 0 deletions
@@ -80,3 +80,31 @@ func ExampleNewDocumentFromReader_string() {

 	// Output: Header
 }
+
+func ExampleSingle() {
+	html := `
+<html>
+  <body>
+    <div>1</div>
+    <div>2</div>
+    <div>3</div>
+  </body>
+</html>
+`
+	doc, err := goquery.NewDocumentFromReader(strings.NewReader(html))
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	// By default, the selector string selects all matching nodes
+	multiSel := doc.Find("div")
+	fmt.Println(multiSel.Text())
+
+	// Using goquery.Single, only the first match is selected
+	singleSel := doc.FindMatcher(goquery.Single("div"))
+	fmt.Println(singleSel.Text())
+
+	// Output:
+	// 123
+	// 1
+}
