Commit cb9313e

add examples for setting follow and URL rules; update README with usage instructions
1 parent 1dd2b55 commit cb9313e

2 files changed: 91 additions, 0 deletions

README.md

Lines changed: 42 additions & 0 deletions
@@ -82,6 +82,48 @@ s = s.SetMultiThread(false)
s := sitemap.New().SetMultiThread(false)
```

#### Follow rules

To set the follow rules, use the `SetFollow()` function. It takes a `[]string` value.
Each element is a regular expression. When parsing a sitemap index, only sitemaps whose `loc` matches at least one of these expressions are followed and parsed.
If no follow rules are provided, all sitemaps in the index are followed.

```go
s := sitemap.New()
s.SetFollow([]string{
	`\.xml$`,
	`\.xml\.gz$`,
})
```
... or ...
```go
s := sitemap.New().SetFollow([]string{
	`\.xml$`,
	`\.xml\.gz$`,
})
```
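
The matching behaviour described above can be illustrated with a small standalone sketch. It is not the library's implementation: `matchesAny` is a hypothetical helper, the `example.com` locations are made up, and the use of an unanchored `regexp.MatchString` is an assumption based on the patterns shown in this README.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesAny reports whether loc matches at least one of the patterns.
// An empty pattern list matches everything, mirroring the documented
// "no follow rules means every sitemap is followed" behaviour.
// Hypothetical helper for illustration; not part of go-sitemap-parser.
func matchesAny(patterns []string, loc string) bool {
	if len(patterns) == 0 {
		return true
	}
	for _, p := range patterns {
		// Assumption: patterns are matched unanchored.
		// Patterns that fail to compile are skipped in this sketch.
		if ok, err := regexp.MatchString(p, loc); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	follow := []string{`\.xml$`, `\.xml\.gz$`}
	locs := []string{
		"https://example.com/sitemap-products.xml",
		"https://example.com/sitemap-news.xml.gz",
		"https://example.com/sitemap.txt",
	}
	for _, loc := range locs {
		fmt.Printf("%-45s follow=%v\n", loc, matchesAny(follow, loc))
	}
}
```

Under these assumptions the first two locations would be followed and the third skipped, because `.txt` matches neither expression.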

#### URL rules

To set the URL rules, use the `SetRules()` function. It takes a `[]string` value.
Each element is a regular expression. Only URLs that match at least one of these expressions are included in the final result.
If no rules are provided, all URLs found are included.

```go
s := sitemap.New()
s.SetRules([]string{
	`product/`,
	`category/`,
})
```
... or ...
```go
s := sitemap.New().SetRules([]string{
	`product/`,
	`category/`,
})
```
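
Because the rules are regular expressions, it can help to check that each pattern compiles before passing the list to `SetRules()` or `SetFollow()`. The sketch below assumes the library uses Go's standard `regexp` syntax, which this README does not state. `validatePatterns` is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// validatePatterns returns an error for the first pattern that does not
// compile with Go's regexp package, or nil if all patterns compile.
// Hypothetical helper for illustration; not part of go-sitemap-parser.
func validatePatterns(patterns []string) error {
	for _, p := range patterns {
		if _, err := regexp.Compile(p); err != nil {
			return fmt.Errorf("invalid pattern %q: %w", p, err)
		}
	}
	return nil
}

func main() {
	// The last pattern is deliberately malformed (unclosed group).
	rules := []string{`product/`, `category/`, `(unclosed`}
	if err := validatePatterns(rules); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("all patterns compile")
}
```

How the library itself reacts to an invalid pattern is not covered here, so validating up front is a defensive choice rather than a requirement.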

#### Chaining methods

In both cases, the functions return a pointer to the main object of the package, allowing you to chain these setting methods in a fluent interface style:

examples/rules/main.go

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
package main

import (
	"fmt"
	"github.com/aafeher/go-sitemap-parser"
	"log"
)

func main() {
	url := "https://versus.com/sitemap_index.xml.gz"

	// create a new instance, override the default configuration and call Parse() with url
	s := sitemap.New().
		SetUserAgent("Mozilla/5.0 (X11; Linux x86_64; rv:123.0) Gecko/20100101 Firefox/123.0").
		SetFetchTimeout(5).
		SetMultiThread(false).
		SetFollow([]string{`/en_phone_162`}).
		SetRules([]string{"oneplus"})
	sm, err := s.Parse(url, nil)
	if err != nil {
		log.Printf("%v", err)
	}

	// Print the errors
	if sm.GetErrorsCount() > 0 {
		log.Println("parsing has errors:")
		for i, err := range sm.GetErrors() {
			log.Printf("%d: %v", i+1, err)
		}
	}

	// GetURLCount()
	count := sm.GetURLCount()

	fmt.Printf("Sitemaps of %s contain %d URLs.\n\n", url, count)

	// GetURLs()
	urlsAll := sm.GetURLs()

	for i, u := range urlsAll {
		fmt.Printf("%d. url -> Loc: %s", i, u.Loc)
		if u.ChangeFreq != nil {
			fmt.Printf(", ChangeFreq: %v", u.ChangeFreq)
		}
		if u.Priority != nil {
			fmt.Printf(", Priority: %.1f", *u.Priority)
		}
		if u.LastMod != nil {
			fmt.Printf(", LastMod: %s", u.LastMod.String())
		}
		fmt.Println()
	}
	fmt.Println()
}
