
Commit bd942c3

Merge pull request #7 from zhongwencool/v0.5.1
Check if the job's name is duplicated.
2 parents e5a9b86 + 9f807b5 commit bd942c3

File tree: 14 files changed, +163 −90 lines changed

README.md

Lines changed: 46 additions & 45 deletions

````diff
@@ -5,24 +5,27 @@
 
 A lightweight/efficient cron-like job scheduling library for Erlang.
 
-Ecron does not poll the system on a minute-by-minute basis like cron does.
-All jobs is assigned to a single process, just run as same as the [timer](http://erlang.org/doc/man/timer.html).
+All Ecron's jobs is assigned to one single gen_server process, just run as same as the [stdlib's timer](http://erlang.org/doc/man/timer.html).
 
-It organize the tasks to be run in a ordered_set ets with the next time to run as key.
+It organize the jobs to be run in a ordered_set ets with the next time to run as key.
 This way, you only need one process that calculates the time to wait until the next task should be executed,
-then spawn the process to execute that task. Saves lots of processes.
+then spawn the process to execute that task.
 more detail see [Implementation](#Implementation).
 
-This implementation also prevents a lot of messages from flying around.
+Ecron does not poll the system on a second-by-second basis like cron does.
+The advantage of not doing this is to avoid lots of messages flying around.
+
+All jobs are managed in one process, rather than running one job per process,
+which saves lots of processes and avoids taking up a lot of memory.
+After all, most of the time the process is waiting(do nothing but eat memory).
 
 It offers:
 
 * Both cron-like scheduling and interval-based scheduling.
-* Well tested by `PropTest` ![Coverage Status](https://coveralls.io/repos/github/zhongwencool/ecron/badge.svg?branch=master).
-* Use gen_server timeout(`receive after`) at any given time (rather than reevaluating upcoming jobs every second/minute).
-* Minimal overhead. ecron aims to keep its code base small.
+* Well tested by [PropTest](https://github.com/proper-testing/proper) ![Coverage Status](https://coveralls.io/repos/github/zhongwencool/ecron/badge.svg?branch=master).
+* Using gen_server timeout(`receive after`) at any given time (rather than reevaluating upcoming jobs every second/minute).
 
-You can find a collection of general best practices in [Full Erlang Examples](https://github.com/zhongwencool/ecron/blob/master/examples/titan_erlang) and [Full Elixir Examples](https://github.com/zhongwencool/ecron/blob/master/examples/titan_elixir).
+You can find a collection of general practices in [Full Erlang Examples](https://github.com/zhongwencool/ecron/blob/master/examples/titan_erlang) and [Full Elixir Examples](https://github.com/zhongwencool/ecron/blob/master/examples/titan_elixir).
 
 ## Installation
 
````
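The hunk above restates the key design point: every job lives in one ordered_set ets table keyed by its next run time. A minimal sketch of why the table head is always the next job due — module and job names here are hypothetical illustrations, not ecron's actual code:

```erlang
%% Hypothetical sketch (not ecron's real module): jobs keyed by
%% {NextRunTime, Name} in an ordered_set table sort by due time first,
%% so ets:first/1 always yields the job that must run soonest.
-module(sched_sketch).
-export([demo/0]).

demo() ->
    Tab = ets:new(jobs, [ordered_set]),
    Now = erlang:system_time(second),
    %% Two fake jobs: job_b due in 5s, job_a due in 1s.
    true = ets:insert(Tab, {{Now + 5, job_b}, {io, format, ["b~n"]}}),
    true = ets:insert(Tab, {{Now + 1, job_a}, {io, format, ["a~n"]}}),
    %% The head of the table is the next job to run.
    {NextTime, Name} = ets:first(Tab),
    SleepMs = max(NextTime - Now, 0) * 1000,
    {Name, SleepMs}.
```

Here `demo/0` returns the soonest job and the milliseconds to sleep before it is due.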

````diff
@@ -66,24 +69,25 @@ You can find a collection of general best practices in [Full Erlang Examples](ht
 {no_singleton_job, "@minutely", {timer, sleep, [61000]}, unlimited, unlimited, [{singleton, false}]}
 ]},
 {global_jobs, []}, %% Global Spec has the same format as local_jobs.
-{cluster_quorum_size, 1} %% Minimum number of nodes which run ecron. Global_jobs only run on majority cluster when it > ClusterNode/2.
+{global_quorum_size, 1} %% Minimum number of nodes which run ecron. Global_jobs only run on majority cluster when it > ClusterNode/2.
 }
 ].
 ```
 
-* When `time_zone` is `local`, current datetime is [calendar:local_time()](http://erlang.org/doc/man/calendar.html#local_time-0).
-* When `time_zone` is `utc`, current datetime is [calendar:universal_time()](http://erlang.org/doc/man/calendar.html#universal_time-0).
-* The job will be auto remove at the end of the time.
+* Default `time_zone` is `local`, the current datetime is [calendar:local_time()](http://erlang.org/doc/man/calendar.html#local_time-0).
+* The current datetime is [calendar:universal_time()](http://erlang.org/doc/man/calendar.html#universal_time-0) when `{time_zone, utc}`.
+* The job will be auto remove at `EndDateTime`, the default value of `EndDateTime` is `unlimited`.
 * Default job is singleton, Each task cannot be executed concurrently.
 * If the system clock suddenly alter a lot(such as sleep your laptop for two hours or modify system time manually),
 it will skip the tasks which are supposed be running during the sudden lapse of time,
 then recalculate the next running time by the latest system time.
-You can also reload task manually by `ecron:reload().`
-* Global jobs depend on [global](http://erlang.org/doc/man/global.html), only allowed to be added statically, [check for more detail](https://github.com/zhongwencool/ecron/blob/master/doc/global.md).
+You can also reload task manually by `ecron:reload().` when the system time is manually modified.
+* Global jobs depend on [global](http://erlang.org/doc/man/global.html), only allowed to be added statically, [check this for more detail](https://github.com/zhongwencool/ecron/blob/master/doc/global.md).
 
 ## Advanced Usage
 
 ```erlang
+%% Same as: Spec = "0 * 0-5,18 * * 0-5",
 Spec = #{second => [0],
         minute => '*',
         hour => [{0,5}, 18], %% same as [0,1,2,3,4,5,18]
````
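The map spec in this hunk uses `{Min, Max}` tuples as ranges: the commit's own comment says hour `[{0,5}, 18]` equals `[0,1,2,3,4,5,18]`. A hypothetical helper (not part of ecron's API) that performs that expansion:

```erlang
%% Illustrative only: flatten a crontab field, turning {Min,Max} range
%% tuples into enumerated values while keeping plain integers as-is.
-module(field_sketch).
-export([expand/1]).

expand(Fields) ->
    lists:append(
      [case F of
           {Min, Max}            -> lists:seq(Min, Max);
           N when is_integer(N)  -> [N]
       end || F <- Fields]).
```

So `field_sketch:expand([{0,5}, 18])` yields the enumerated list the README comment shows.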
````diff
@@ -109,7 +113,7 @@ EveryMFA = {io, format, ["Runs every 120 second.~n"]},
 ```
 ## Debug Support
 
-There are some function to get information for a Job and to handle the Job and Invocations.
+There are some function to get information for debugging jobs.
 ````erlang
 1> ecron:deactivate(CrontabName).
 ok
````
````diff
@@ -126,8 +130,8 @@
   start_time => unlimited,end_time => unlimited,
   failed => 0,mfa => {io,format,["ddd"]},
   name => test,status => activate,type => cron,
-  ok => 0,results => [],run_microsecond => [],
-  opts => [{singleton,true}],
+  ok => 1,results => [ok],run_microsecond => [12],
+  opts => [{singleton,true}], node => 'test@127.0.0.1',
   next =>
   ["2019-09-27T01:00:00+08:00","2019-09-27T13:00:00+08:00",
   "2019-09-30T01:00:00+08:00","2019-09-30T13:00:00+08:00",
````
`````diff
@@ -153,6 +157,26 @@
   type => cron}
 }
 ````
+## Implementation
+The local_jobs workflow is as follows:
+1. `ecron_sup` (supervisor) would Start a standalone gen_server `ecron`, when application starts.
+2. Look for configuration `{jobs, Jobs}` when `ecron` process initialization.
+3. For each crontab job found, determine the next time in the future that each command must run.
+4. Place those commands on the ordered_set ets with their `{Corresponding_time, Name}` to run as key.
+5. Enter main loop:
+   * Examine the task entry at the head of the ets, compute how far in the future it must run.
+   * Sleep for that period of time by gen_server timeout feature.
+   * On awakening and after verifying the correct time, execute the task at the head of the ets (spawn in background).
+   * Delete old key in ets.
+   * Determine the next time in the future to run this command and place it back on the ets at that time value.
+
+Additionally, `ecron` also collect job's latest 16 results and execute times, you can observe by `ecron:statistic(Name)`.
+
+[Check this for global_jobs workflow](https://github.com/zhongwencool/ecron/blob/master/doc/global.md#Implementation).
+
+## Telemetry
+Ecron publish events through telemetry, you can handle those events by [this guide](https://github.com/zhongwencool/ecron/blob/master/doc/telemetry.md),
+such as you can monitor events dispatch duration and failures to create alerts which notify somebody.
 
 ## CRON Expression Format
 
`````
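The "sleep until the head job is due" step of the main loop described in the README's new Implementation section can be sketched as a timeout computation; `loop_sketch` and its names are illustrative assumptions, not ecron's real module:

```erlang
%% Hypothetical sketch of the gen_server-timeout idea: compute how many
%% milliseconds the single scheduler process should sleep before the job
%% at the head of the ordered_set table is due.
-module(loop_sketch).
-export([next_timeout/2]).

next_timeout(Tab, NowMs) ->
    case ets:first(Tab) of
        '$end_of_table' -> infinity;              %% no jobs: wait forever
        {DueMs, _Name}  -> max(DueMs - NowMs, 0)  %% clamp at zero if overdue
    end.
```

A gen_server would return this value as its timeout, wake on `timeout`, run the head job, and re-insert it under its next due time.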
````diff
@@ -223,10 +247,11 @@ Entry | Description | Equivalent
 >You might find something like [https://crontab.guru/](https://crontab.guru/) or [https://cronjob.xyz/](https://cronjob.xyz/) helpful.
 >But, note that these don't necessarily accept the exact same syntax as this library,
 >for instance, it doesn't accept the seconds field, so keep that in mind.
+>The best way to verify the spec format is `ecron:parse_spec("0 0 1 1 1-6 1", 10).`.
 
 ## Intervals
 
-You may also schedule a job to execute at fixed intervals, starting at the time it's added or cron is run.
+You may also execute job at fixed intervals, starting at the time it's added or cron is run.
 This is supported by formatting the cron spec like this:
 ```shell
 @every <duration>
````
````diff
@@ -235,34 +260,10 @@ For example, "@every 1h30m10s" would indicate a schedule that activates after 1
 
 >Note: The interval doesn't take the job runtime into account.
 >For example, if a job takes 3 minutes to run, and it is scheduled to run every 5 minutes,
->it will have 5 minutes of idle time between each run.
+>it also has 5 minutes of idle time between each run.
 
-## Implementation
-
-1. On application start-up, start a standalone gen_server `ecron` under supervision tree(`ecron_sup`).
-2. Look for configuration `{jobs, Jobs}` when ecron process initialization.
-3. For each crontab job found, determine the next time in the future that each command must run.
-4. Place those commands on the ordered_set ets with their `{Corresponding_time, Name}` to run as key.
-5. Enter main loop:
-   * Examine the task entry at the head of the ets, compute how far in the future it must run.
-   * Sleep for that period of time by gen_server timeout feature.
-   * On awakening and after verifying the correct time, execute the task at the head of the ets (spawn in background).
-   * Delete old key in ets.
-   * Determine the next time in the future to run this command and place it back on the ets at that time value.
-
-Additionally, this ecron also collect the MFA latest 16 results and execute times, you can observe by `ecron:statistic(Name)`.
-
-## Telemetry
-Ecron publish events through telemetry, you can handle those events by [this guide](https://github.com/zhongwencool/ecron/blob/master/doc/telemetry.md),
-such as you can monitor events dispatch duration and failures to create alerts which notify somebody.
-
-## Proper Test
-
+## Test
 
 ```shell
 $ rebar3 do proper -c, ct -c, cover -v
 ```
-
-## TODO
-
-* support the last day of a month.
````

changelog.md

Lines changed: 6 additions & 2 deletions

````diff
@@ -1,6 +1,10 @@
+### 0.5.1
+- Replace `cluster_quorum_size` by `global_quorum_size`.
+- Check if job's name is duplicate.
+
 ### 0.5.0
-- support global_jobs by `global`.
-- add global/up/down telemetry metrics.
+- Support global_jobs by `global`.
+- Add global/up/down telemetry metrics.
 
 ### 0.4.0
````

doc/global.md

Lines changed: 64 additions & 24 deletions

````diff
@@ -1,33 +1,73 @@
 ### Precondition
-Because it depends on `global`'s name registration service.
-The name server also maintains a fully connected network.
-For example, if node N1 connects to node N2 (which is already connected to N3),
-the global name servers on the nodes N1 and N3 ensure that also N1 and N3 are connected.
-In other words, command-line flag -connect_all false can not be used
+
+1. Fully Connected Cluster
+
+Because it depends on `global`'s name registration service.
+The name server must maintains a fully connected network.
+
+For example, if node N1 connects to node N2 (which is already connected to N3),
+the global name servers on the nodes N1 and N3 ensure that also N1 and N3 are connected.
+In other words, command-line flag `-connect_all false` can not be used.
+
+2. Same Global Configuration
+
+All node's `global_quorum_size` and `global_jobs` must keep the same value.
+This ensures that the global task manager can transfer between nodes when the network splits.
 
 ### Configuration
-`cluster_quorum_size` - A majority of the ecron must respond, default is 1.
 
-If you want to make sure always one global task manager run in cluster even at brain split.
+1. `global_jobs`
+
+the same format as `local_jobs`, default is `[]`.
+This means only run local jobs without running global task manager and monitor processes.
+
+2. `global_quorum_size`
+
+ecron application live on at least `global_quorum_size` nodes in the same cluster, can be regarded as a healthy cluster.
+
+Global task manager only running on a healthy cluster.
+
+If you want to guarantee always no more than **one** global task manager even when the cluster has network split,
+you should set it to **"half plus one"**. For example:
 
-You should set it to “half plus one”.
+Run on majority:
+1. `ABC` 3 nodes in one cluster.
+2. `global_quorum_size=2`.
+3. (`ABC`) cluster split into 2 part(`AB` =|= `C`).
+4. the global task manager would run on `AB` cluster(`AB` is the healthy cluster now).
+5. `C` node only running local jobs without global jobs.
 
-for example:
+Run on majority
+1. `ABC` 3 nodes in one cluster.
+2. `global_quorum_size=2`.
+3. (`ABC`) cluster split into 3 part(`A` =|= `B` =|= `C`).
+4. every node only running local jobs without global jobs(all nodes is unhealthy).
 
-### Run on majority
-1. Set up 3 nodes in on cluster.
-2. `cluster_quorum_size=2`.
-3. (`ABC`) nodes cluster split into 2 part(`AB` =/= `C`).
-4. the global task manager will run on `AB` cluster.
+Run on every node if brain split.
+1. `ABC` nodes in one cluster.
+2. `global_quorum_size=1`.
+3. (`ABC`) cluster split into 3 part(`A` =|= `B` =|= `C`).
+4. the global task manager would run on every nodes(we have three healthy cluster now).
+5. But the global task manager only running one in the same cluster.
 
-### Don't run
-1. Set up 3 nodes in on cluster.
-2. `cluster_quorum_size=2`.
-3. (`ABC`) nodes cluster split into 3 part(`A` =/= `B` =/= `C`).
-4. the global task manager doesn't run.
+### Implementation
+1. The top supervisor `ecron_sup` start at first.
+2. Nothing will happen if the `global_jobs` is empty.
+3. When `global_jobs` is not empty, `ecron_sup` would start_link `ecron_monitor` worker (gen_server).
+4. `ecron_monitor` subscribes node's up/down messages by [net_kernel:monitor_nodes(true)](http://erlang.org/doc/man/net_kernel.html#monitor_nodes-1), when it initializes.
+5. Checking if there is enough `ecron` process in the cluster(`global_quorum_size`).
+6. Trying to terminate global job manager process when cluster's `ecron` number less than `global_quorum_size`.
+7. Otherwise, trying to start a global job manager process, This gen_server register by [global:register_name/2](http://erlang.org/doc/man/global.html#register_name-2).
+8. All the nodes are rushing to register this global jobs manager process, only one node will success, other node's `ecron_monitor` would link this process if the process already exists.
+9. The `ecron_monitor` will receive notification, when node down/up or the global job manager dies.
+10. Enter step 5 again, When notified.
 
-### Run on every node if brain split.
-1. Set up 3 nodes in on cluster.
-1. `cluster_quorum_size=1`.
-2. (`ABC`) nodes cluster split into 3 part(`A` =/= `B` =/= `C`).
-3. the global task manager will run on every nodes.
+```
+NodeA         NodeB          NodeC
+ sup           sup            sup
+  |             | \            |
+monitor         |  monitor   monitor
+  |             |     |        |
+  |             |    link      |
+  |____link____GlobalJob__|____link____|
+```
````
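The healthy-cluster rule this diff documents ("half plus one" of the cluster guarantees at most one partition can host the global job manager) can be illustrated with a small helper; the module and function names are hypothetical, not ecron's API:

```erlang
%% Illustrative quorum arithmetic for the global_quorum_size rule.
-module(quorum_sketch).
-export([majority/1, healthy/2]).

%% "Half plus one" of the full cluster size: at most one partition can
%% ever reach this many nodes after a split.
majority(ClusterSize) -> ClusterSize div 2 + 1.

%% A partition may host the global job manager only if it still sees at
%% least QuorumSize nodes running ecron.
healthy(EcronNodes, QuorumSize) -> length(EcronNodes) >= QuorumSize.
```

With 3 nodes and `global_quorum_size = majority(3) = 2`, the `AB` partition is healthy while the lone `C` node is not, matching the first scenario above.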

doc/telemetry.md

Lines changed: 8 additions & 2 deletions

````diff
@@ -11,7 +11,9 @@ you can use your own event handler. For example, you can create a module to hand
 ```erlang
 -module(my_ecron_telemetry_logger).
 -include_lib("kernel/include/logger.hrl").
--define(Events, [[ecron, success], [ecron, failure], [ecron, activate], [ecron, deactivate], [ecron, delete]]).
+-define(Events, [[ecron, success], [ecron, failure], [ecron, activate],
+                 [ecron, deactivate], [ecron, delete],
+                 [ecron, global, up], [ecron, global, down]]).
 %% API
 -export([attach/0, detach/0]).
 -define(TELEMETRY_HANDLE, ecron_telemetry_metrics).
@@ -35,7 +37,11 @@ handle_event([ecron, success], #{run_microsecond := Ms, run_result := Res},
 handle_event([ecron, failure], #{run_microsecond := Ms, run_result := {Error, Reason, Stack}},
              #{name := Name, mfa := MFA}, _Config) ->
     ?LOG_ERROR("EcronJob(~p)-~p CRASH in ~p microsecond. {Error, Reason}: {~p, ~p}. Stack:~p",
-               [Name, MFA, Ms, Error, Reason, Stack]).
+               [Name, MFA, Ms, Error, Reason, Stack]);
+handle_event([ecron, global, up], #{action_ms := Time, reason := Reason}, #{node := Node}, _Config) ->
+    ?LOG_INFO("Ecron Global UP on ~p at -~p ms because of ~p.", [Node, Time, Reason]);
+handle_event([ecron, global, down], #{action_ms := Time, reason := Reason}, #{node := Node}, _Config) ->
+    ?LOG_INFO("Ecron Global DOWN on ~p at -~p ms because of ~p.", [Node, Time, Reason]).
 ```
 
 Once you have a module like this, you can attach it when your application starts:
````

examples/titan_elixir/config/config.exs

Lines changed: 1 addition & 1 deletion

````diff
@@ -59,7 +59,7 @@ config :ecron, :local_jobs,
   {:no_singleton_job, "@minutely", {Process, :sleep, [61000]}, :unlimited, :unlimited, singleton: false}
 ]
 
-config :ecron, :cluster_quorum_size, 1
+config :ecron, :global_quorum_size, 1
 config :ecron, :global_jobs,
 [
   {:global_crontab_job, "*/15 * * * * *", {StatelessCron, :inspect, ["Runs on 0, 15, 30, 45 seconds"]}},
````

examples/titan_elixir/mix.exs

Lines changed: 1 addition & 1 deletion

````diff
@@ -16,6 +16,6 @@ defmodule TitanElixir.MixProject do
   #
   # Run "mix help deps" for examples and options.
   defp deps do
-    [{:ecron, ">= 0.5.0"}]
+    [{:ecron, ">= 0.5.1"}]
   end
 end
````

examples/titan_elixir/mix.lock

Lines changed: 1 addition & 1 deletion

````diff
@@ -1,4 +1,4 @@
 %{
-  "ecron": {:hex, :ecron, "0.5.0", "9d457d548c0f7b09acd558270f5ad02bb10a5db89e3d9a109e17fa13620042e8", [:rebar3], [{:telemetry, "~>0.4.0", [hex: :telemetry, repo: "hexpm", optional: false]}], "hexpm"},
+  "ecron": {:hex, :ecron, "0.5.1", "08a1e05486da327f9277e8c512ad0f7f7a5cc231b52f1c870cee90de319dbb9b", [:rebar3], [{:telemetry, "~>0.4.0", [hex: :telemetry, repo: "hexpm", optional: false]}], "hexpm"},
   "telemetry": {:hex, :telemetry, "0.4.0", "8339bee3fa8b91cb84d14c2935f8ecf399ccd87301ad6da6b71c09553834b2ab", [:rebar3], [], "hexpm"},
 }
````

examples/titan_erlang/config/sys.config

Lines changed: 1 addition & 1 deletion

````diff
@@ -2,7 +2,7 @@
 {titan, []},
 {ecron, [
    {time_zone, local}, %% local or utc
-   {cluster_quorum_size, 1},
+   {global_quorum_size, 1},
    {global_jobs, [
      {global_crontab_job, "*/15 * * * * *", {stateless_cron, inspect, ["Runs on 0, 15, 30, 45 seconds"]}}
    ]},
````

examples/titan_erlang/rebar.lock

Lines changed: 2 additions & 2 deletions

````diff
@@ -1,8 +1,8 @@
 {"1.1.0",
-[{<<"ecron">>,{pkg,<<"ecron">>,<<"0.5.0">>},0},
+[{<<"ecron">>,{pkg,<<"ecron">>,<<"0.5.1">>},0},
  {<<"telemetry">>,{pkg,<<"telemetry">>,<<"0.4.0">>},1}]}.
 [
 {pkg_hash,[
- {<<"ecron">>, <<"9D457D548C0F7B09ACD558270F5AD02BB10A5DB89E3D9A109E17FA13620042E8">>},
+ {<<"ecron">>, <<"08A1E05486DA327F9277E8C512AD0F7F7A5CC231B52F1C870CEE90DE319DBB9B">>},
  {<<"telemetry">>, <<"8339BEE3FA8B91CB84D14C2935F8ECF399CCD87301AD6DA6B71C09553834B2AB">>}]}
 ].
````

src/ecron.app.src

Lines changed: 2 additions & 2 deletions

````diff
@@ -1,13 +1,13 @@
 {application, ecron,
  [{description, "cron-like/crontab job scheduling library"},
-  {vsn, "0.5.0"},
+  {vsn, "0.5.1"},
   {registered, [ecron_sup, ecron]},
   {mod, {ecron_app, []}},
   {applications, [kernel, stdlib, telemetry]},
   {env, [
     {adjusting_time_second, 604800}, %7*24*3600
     {time_zone, local}, %% local or utc
-    {cluster_quorum_size, 1}, %% A majority of the nodes must connect.
+    {global_quorum_size, 1}, %% A majority of the nodes must connect.
     {local_jobs, [
       %% {JobName, CrontabSpec, {M, F, A}}
       %% {JobName, CrontabSpec, {M, F, A}, StartDateTime, EndDateTime}
````
